Digital Duct Tape
1999; American Association for the Advancement of Science; Volume: 284; Issue: 5418; Language: English
10.1126/science.284.5418.1293b
ISSN 1095-9203
Authors: Robert Sikorski, Richard Peters
Topic(s): Scientific Computing and Data Management
Abstract: As labs increasingly embrace automation and computer technologies, the need for tools to manage the growing amount of data becomes more and more important. Usually, data management requires a database to serve as a repository that can be searched, saved, and edited. However, getting data into a database can be a challenge in a laboratory setting, where information can come from many disparate sources. What is needed is an all-purpose tool to move data and manipulate the computers that store the data. Enter Perl, a powerful programming language that has become known as the “duct tape” of the Internet.

Perl's ability to glue together data from just about anywhere is useful to a wide array of applications, but there are five features of Perl that make it particularly attractive for use in a biomedical laboratory. First, Perl code will run on the three major operating systems commonly found in the lab: Windows (NT and 95/98), Macintosh, and Unix (Linux). In fact, the exact same Perl code can often be simply moved from one operating system to another and run without change. Second, the basics of Perl are relatively easy to learn. You write Perl in a simple text editor, and no special tools are required. When you run a Perl program, you just feed the code to a software engine that interprets your commands and does whatever you have coded. Third, Perl is one of the best tools available to manipulate text. In fact, it has an entire internal language for pattern matching and text substitutions. Fourth, hundreds of Perl modules (or prebuilt pieces of code) exist on the Web. You can find code that makes graphics, manipulates text, runs administrative routines, and so on. And finally, Perl is supported on the Web by an open community of developers who are constantly providing new features and answering questions from experts and novices alike.

Perl can help in the lab in various ways.
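To make the third feature concrete, here is a minimal sketch of pattern matching and substitution in Perl. The record layout ("Vial 07   3H-CPM: 10432.5") and the sub names `parse_count_line` and `tidy` are assumptions made up for illustration; real instrument output will differ, and the pattern would need to be adapted to it.

```perl
use strict;
use warnings;

# Hypothetical record format, assumed for illustration only:
#   "Vial 07   3H-CPM: 10432.5"
# Returns (vial number, counts per minute), or an empty list if no match.
sub parse_count_line {
    my ($line) = @_;
    if ($line =~ /^Vial\s+(\d+)\s+\S+-CPM:\s*([\d.]+)\s*$/) {
        return ($1 + 0, $2 + 0);    # "+ 0" turns the captured text into numbers
    }
    return;
}

# Substitution: collapse runs of whitespace, then trim both ends.
sub tidy {
    my ($text) = @_;
    $text =~ s/\s+/ /g;
    $text =~ s/^ | $//g;
    return $text;
}

my ($vial, $cpm) = parse_count_line("Vial 07   3H-CPM: 10432.5");
print "vial $vial: $cpm cpm\n";
print tidy("  sample   A12  "), "\n";
```

The match operator (`=~` with `/.../`) captures the fields in parentheses, and `s/.../.../` rewrites text in place; most of the lab tasks described in this article reduce to combinations of these two operations.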
Suppose you have reams of digital data from a scintillation counter run and you want to move all of this into an Access database for storage or transfer to a colleague. You could cut and paste all of the data into the proper fields, but that would be labor-intensive and error-prone. Instead, you could write a short Perl program that opens each file, scans for header elements and data, and enters the data into individual records in the database. The exquisite pattern-matching abilities of Perl make parsing text a straightforward task.

Another example of pattern matching might be to sort through your old e-mail messages. If you use Microsoft Outlook Express as a client, your mail is stored in a special text file in the Windows/Application Data directory. With Perl, you could read the text of your mail, parse out the From:, To:, and Body: fields, and enter them in a database for safekeeping.

Another example of Perl in the lab might be to manipulate DNA sequence data. In fact, if you look behind the scenes of a major sequencing lab, you are likely to find Perl tools. With Perl, you can scan rapidly through thousands of lines of sequence to find absolute matches and pattern matches, from simple ones, such as the recognition sites of Type II restriction endonucleases, to complex ones, such as degenerate repeat motifs.

Perl can also talk to the operating system itself. If you use Windows NT, for example, you could completely automate the entry of passwords and file permissions, using Perl “administrative” modules that are easy to program. Just ask anyone running even the smallest of LANs (local area networks) how tedious system administration issues can be when you run a server, and you will see the value of a Perl solution.

Perl can also be used to generate Web pages to publish data on a lab Web site. Suppose you have data on yeast strains and genotypes that are stored in an Excel spreadsheet. You would like to make Web pages to show your colleagues what strains you have and how to request them. You could use Perl's ability to talk to Windows applications to read the data from the spreadsheet and generate HTML code for the Web on the fly. In fact, because Perl can also FTP (file transfer protocol) documents across the Web, you could automatically FTP the Web pages to your Web server. And finally, because Perl works well with Web servers, you can use simple Perl scripts to capture all kinds of data through Web forms. You could use Web forms to get registrations for a meeting, requests for mouse strains, or even have potential postdocs upload resumes for your review.

We have tried to touch on a few examples of Perl applications in a lab, but ultimately the best applications will likely come from interested graduate students or postdocs who see a problem and have a basic understanding of the Perl language and tools. The Internet can serve as their training ground. On the Web, you can find samples of code, software, and tips from a world of experts. We suggest starting at and The first URL is for anyone interested in the basics of Perl, and the second is a great one-stop shop for Windows users only. Virtually all of the tools you need to use Perl are absolutely free on the Web.
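As a starting point, the yeast strain page described above could be sketched roughly as follows. This is not the authors' code. It assumes the spreadsheet has already been exported as tab-delimited text (one "strain, tab, genotype" record per line) rather than read directly from Excel, and the sub names `html_escape` and `strains_to_html` and the sample records are invented for illustration.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn tab-delimited lines ("strain<TAB>genotype") into an HTML table.
sub strains_to_html {
    my (@lines) = @_;
    my $html = "<table>\n<tr><th>Strain</th><th>Genotype</th></tr>\n";
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;                  # skip blank lines
        my ($strain, $genotype) = split /\t/, $line, 2;
        $genotype = '' unless defined $genotype;    # tolerate a missing column
        $html .= sprintf "<tr><td>%s</td><td>%s</td></tr>\n",
            html_escape($strain), html_escape($genotype);
    }
    return $html . "</table>\n";
}

# Made-up example records; real data would come from the exported file.
my @records = (
    "LAB001\tMATa ura3-52 leu2",
    "LAB002\tMATalpha his3 <unverified>",
);
print strains_to_html(@records);
```

Escaping each field before it goes into the page keeps stray characters in a genotype string, such as `<` or `&`, from breaking the generated HTML.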