Power pool: how grid computing can benefit bio

By Staff Writers
Tuesday, 23 April, 2002


In early 2000, Cereon Genomics had a serious situation on its hands: It was running out of computing power.

Cereon, based in Massachusetts, combines genomics research tools with high-speed computing to discover genes for enhancing farmers' crops. Historically, Cereon, established in 1997 as a subsidiary of Monsanto, ran gene-discovery applications on its largest mainframe. But advances in genomics tools and lab processes caused Cereon's data production to expand at a fantastic rate, until it became too much for the mainframe to handle. "We were awash in terabytes," says Cereon COO Mark Trusheim. "We have to discover our products quickly and be first to market, and our ability to understand the raw data created by all our genomics tools became a huge bottleneck in our research pipeline."

Cereon needed more computing power, fast. So it tapped into its existing Unix server architecture, bought a bunch of new boxes and networked a grid of processors in Cambridge and St Louis into a virtual supercomputer that company researchers could use to submit jobs from their desktops. Specialised software from Platform Computing broke large jobs into smaller computing tasks, distributed them among the CPUs in the grid and reassembled the results into a finished product. The grid was up and running by mid-2000, and Trusheim says it's been a huge benefit. "It's helped us optimise the use of the hardware we have, and we see less need to add," he says. "We've been saving millions of dollars of IT hardware cost over the last two years as we automatically load balance across processors and now physical data centres."

Cereon's solution is not completely out of left field. The idea behind grid computing (historically known as distributed computing) has been around for years. It simply means submitting massive jobs to a dispersed network of computing resources, harnessing idle processor cycles for additional computing power on demand. Until the past couple of years, however, distributed computing has been primarily the province of academia and non-profit research. But its new form, grid computing, is starting to emerge in a commercial context as well. Early adopters, including biotech companies, pharmaceutical makers and chip manufacturers, are building their own grids to handle complex problems. And once the technology matures, adoption of grid computing will be more widespread, both within and among enterprises.

Grid proponents' ultimate goal is a worldwide grid that users can access over the Internet through service providers on a pay-as-you-go basis.

How it works

Grid computing uses networked clusters of CPUs connected over the Internet, a company intranet or a corporate WAN. The resulting network of CPUs acts as a foundation for a set of grid-enabling software tools. These tools let the grid accept a large computing job and break it down into thousands of independent tasks. The tools then search the grid for available resources, assign tasks to processors, aggregate the work and spit out one final result. Grid toolkits also contain middleware that enables a diverse, multi-vendor array of hardware to accept assignments and handle all the same applications.
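The scatter-gather pattern described above can be sketched in miniature on a single machine. This is not Platform Computing's software, just an illustrative sketch using Python's standard process pool, with a made-up `analyse` function standing in for one unit of work:

```python
from multiprocessing import Pool

def analyse(task):
    # Stand-in for one independent computing task, e.g. scoring
    # one chunk of sequence data. Here: a sum of squares.
    return sum(x * x for x in task)

def run_job(data, chunk_size=10, workers=4):
    # 1. Break the large job into independent tasks.
    tasks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # 2. Distribute the tasks among available processors.
    with Pool(workers) as pool:
        partials = pool.map(analyse, tasks)
    # 3. Aggregate the partial results into one final answer.
    return sum(partials)

if __name__ == "__main__":
    print(run_job(list(range(100))))
```

A real grid toolkit does the same three steps, but the "pool" spans many machines, and middleware handles scheduling, fault tolerance and heterogeneous hardware.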

Ian Baird, Platform's chief business architect and corporate grid strategist, says the grid is a way to get maximised utilisation of existing resources without spending millions of dollars on hardware. For example, he said, one Platform customer, a bioinformatics company, planned to spend approximately $US3 million on new hardware to expand its computing resources. Instead it spent around $US150,000 to install a grid, and it no longer needs to buy the new hardware.

Despite the potential benefits, grid-ready applications remain rare. The technology best suits computationally intensive problems whose algorithms can be broken into discrete, independent units, such as genetic research, where scientists must mathematically analyse thousands of genes in combination to find matches. And that's not a task most corporations face.
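The gene-matching case decomposes so well because each comparison is independent of every other. A minimal sketch, with a hypothetical `similarity` score and toy sequences, shows how the combinatorial analysis splits into separate tasks:

```python
from itertools import combinations

def similarity(a, b):
    # Hypothetical score: fraction of positions where two
    # equal-length sequences agree.
    return sum(x == y for x, y in zip(a, b)) / len(a)

genes = {"g1": "ACGT", "g2": "ACGA", "g3": "TCGT"}

# Each pair is one independent task: no score depends on any
# other, so the full set can be farmed out across a grid and
# the results merged in any order.
tasks = list(combinations(genes, 2))
scores = {(a, b): similarity(genes[a], genes[b]) for a, b in tasks}
```

With thousands of genes the pair count grows quadratically, which is exactly when spreading the tasks across many processors pays off.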

Despite its seemingly limited applicability, grid computing has generated considerable buzz. A number of major hardware vendors, including Compaq, Hewlett-Packard and IBM, have announced commercial grid-computing initiatives in the past year. Grid bundles seem to be the favoured approach, but some vendors are striving for a utility model of grid computing.

The future

Before grid computing moves into the commercial mainstream, CIOs need to learn more about the technology and its possibilities, and identify ways they can use it. But proponents claim that just about any sophisticated company can find a need for high-volume number crunching. Peter Jeffcock, Sun's group marketing manager for software products in the technical market products group, says that grid computing can help any company that does its own software development and testing. "You're running weekly, nightly and sometimes daily regression tests. If you could run them over lunch and say, 'Here's what you need to fix in the afternoon,' you could deploy the products much quicker."

But other problems need to be solved before grid computing becomes truly widespread, particularly in the context of inter-enterprise grids, utility models and ultimately a global grid. The biggest issue is security. If you're sharing the grid with other companies, you need assurances that nobody else can get to the confidential information you're throwing into the system. Standardisation also remains a challenge. A grid involves sharing distributed heterogeneous resources and bringing together a number of operating systems, vendor platforms and applications. Getting them all to talk will require new protocols. A number of non-profit groups such as the Global Grid Forum, the Globus Project and the New Productivity Initiative are working on security and standardisation issues. But until these issues are worked out, grid computing will likely remain an internal corporate effort.
