Google Launches Code Search

The newest member of Google Labs, Google’s playground for new ideas, is Code Search, an ambitious project that aims to index billions and billions of lines of code.

Code Search gives users the ability to search publicly accessible source code.  According to Google, source code lives on the internet in two forms: in archives such as zip and gzip files, or in source control repositories such as those hosted on SourceForge, Google’s code hosting, and other places.  Google aims to index source code found in all of those places.

Google won’t just be indexing the zip file itself; it will also open the archive, unzip it, and index all the individual files within it.
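To make the unzip-and-index idea concrete, here is a minimal Python sketch. It is not Google's implementation, which has not been described in any detail; the function names index_zip and make_sample_zip, the sample files, and the simple token index are all assumptions made for illustration.

```python
import io
import re
import zipfile
from collections import defaultdict


def index_zip(archive_bytes):
    """Map each identifier-like token to the (file name, line number) pairs where it occurs."""
    index = defaultdict(set)
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no source text
            # Decode leniently: real archives mix encodings and binary files.
            text = archive.read(info).decode("utf-8", errors="replace")
            for line_number, line in enumerate(text.splitlines(), start=1):
                for token in re.findall(r"[A-Za-z_]\w*", line):
                    index[token].add((info.filename, line_number))
    return index


def make_sample_zip():
    """Build a tiny in-memory archive (hypothetical contents) to index."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("src/hello.c", "int main(void) {\n    return 0;\n}\n")
        archive.writestr("README.txt", "Build with make\n")
    return buffer.getvalue()


if __name__ == "__main__":
    index = index_zip(make_sample_zip())
    print(sorted(index["main"]))
```

Each file inside the archive gets its own entries in the index, so a match can point to a specific file and line instead of to the zip file as a whole.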

The regular Googlebot crawler is being updated to recognize these zip files, and although Google is unable to give an exact figure, it says that Code Search already contains billions of lines of code.  A number that large makes you wonder how easily the code can be searched. But fear not: the smart people at Google have set it up so that searches can be narrowed by software license, programming language, and file name.  With all three of those in place, you should be able to find the code you are looking for. Users may also search by regular expressions and patterns of words.
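As a rough illustration of how those three filters might combine with a regular expression, here is a small Python sketch over an in-memory list of files. The SourceFile record, the search function, its parameter names, and the sample corpus are all hypothetical; they are not the Code Search query syntax or API.

```python
import fnmatch
import re
from dataclasses import dataclass


@dataclass
class SourceFile:
    path: str
    language: str
    license: str
    text: str


def search(files, pattern, language=None, license=None, file_glob=None):
    """Return (path, line number, line) for every line matching the regex,
    restricted to files that pass the optional filters."""
    regex = re.compile(pattern)
    hits = []
    for source in files:
        if language and source.language != language:
            continue
        if license and source.license != license:
            continue
        if file_glob and not fnmatch.fnmatchcase(source.path, file_glob):
            continue
        for line_number, line in enumerate(source.text.splitlines(), start=1):
            if regex.search(line):
                hits.append((source.path, line_number, line.strip()))
    return hits


# Hypothetical two-file corpus used only for demonstration.
CORPUS = [
    SourceFile("src/hello.c", "c", "gpl", "int main(void) {\n    return 0;\n}\n"),
    SourceFile("lib/util.py", "python", "bsd", "def main():\n    return 0\n"),
]

if __name__ == "__main__":
    for hit in search(CORPUS, r"\bmain\b", language="c"):
        print(hit)
```

The filters shrink the set of candidate files before any pattern matching happens, which is why combining them makes a very large index practical to query.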

Launching alongside Code Search is the Code Search API, which will allow coders and programmers to further extend the project. 

There will be no AdSense ads in the Code Search results, at least at first, and the results will not show up in regular Google searches.

One potential use of Code Search is for developers to see where their code is being used. This may help to combat plagiarism and software license violations.

Most of the code indexed by Google is open-source, and the company believes that very little of it is proprietary since it is all posted in public places.  However, as with anything, there remains the possibility that some people may post other people’s code illegally.  Therefore, there may be some proprietary content included in the Code Search index.  Luckily, Google has created a way for such instances to be reported so that the code can be removed from the index.

You can access Code Search directly at http://www.google.com/codesearch, via the Google Labs page, or by clicking “Advanced Search” on the Google web search.

1 Comment

  1. SeriousCoder, 10-28-2006

    Hm, I don’t like the UI of Google’s Codesearch. It all looks kind of crude and chaotic. The Google guys should learn from http://merobase.com. Better UI, better result presentation, API-based queries… sometimes small startups seem to be more creative than good ol’ Google…
