8 May 2007

Document Search With Google Search Software Development Kit

by mo

Ok… so when working with Google Desktop using the query model that my previous post described, you don’t get access to all sorts of other document properties that are, supposedly, available when you work directly against the Google Desktop Search SDK. (Those damn .idl files!)

The setup… so first of all you’re going to need to install Google Desktop on your machine so that it can take care of indexing files… (You could probably take the necessary Google assemblies, register just the ones you want, and create your own Windows service to add files to the Google index… I don’t know, that’s probably beyond my skill set at the moment!)

Once you’ve got Google Desktop Search installed, you’re going to need to add two COM references to your project… These guys are:

  • Google Desktop Search API 1.1 Type Library
  • Google Desktop Search Query API 1.0

[Screenshot: the Google COM references]

Ok, so in order to use the Google Query API your app first has to register with it. The Query API will then return a cookie that you need to hang on to. You’ll use this cookie to make any search requests to the Google Query API.

In order to register you need to provide a description of your application and a globally unique identifier (GUID) for your application. Here’s a very rudimentary example… (I highly suggest that you don’t actually use this code, but hopefully it helps with learning.)

  public static int Register( )
  {
    // Name/value pairs describing the application to Google Desktop.
    Object[] description = new Object[] {
      "Title", " tests", "Description", "Simple tests", "Icon", "My Icon@1"
    };
    const String applicationGuid = "{5323E036-345C-4323-548D-32AA55603215}";
    GoogleDesktopRegistrar registrar = new GoogleDesktopRegistrar( );
    registrar.StartComponentRegistration( applicationGuid, description );
    object regObjObj = registrar.GetRegistrationInterface( "GoogleDesktop.QueryRegistration" );
    IGoogleDesktopRegisterQueryPlugin query = regObjObj as IGoogleDesktopRegisterQueryPlugin;
    if( query == null ) {
      return 0;
    }
    else {
      // Passing true asks for a read-only cookie.
      Int32 cookie = query.RegisterPlugin( applicationGuid, true );
      // Prompts the user; a COMException is raised if they cancel.
      registrar.FinishComponentRegistration( );
      return cookie;
    }
  }

So what’s going on?… Well, in the above code there is a hard-coded GUID (“{5323E036-345C-4323-548D-32AA55603215}”) and an object array of strings. The object array is the description of your application, written as alternating names and values. When you call FinishComponentRegistration() a confirmation dialog will pop up.

This is Google’s way of asking the end user whether or not they want your application installed on their PC. If OK is pressed you can use the cookie received from the RegisterPlugin() call on the line above it. If Cancel is pressed a COMException is raised and that cookie is useless!
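
Since a cancelled dialog surfaces as a COMException, the caller probably wants to catch it rather than crash. Here’s a rough sketch of one way to do that. TryRegister and RegistrationHelper are names I made up for illustration, and the registration call is passed in as a delegate so the sketch compiles without the Google COM references. I’m also assuming that any COMException at this point means the user cancelled, which may not always be true.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RegistrationHelper
{
    // 'register' stands in for a method like Register( ) above
    // (hypothetical wiring, not part of the Google SDK).
    public static bool TryRegister( Func<int> register, out int cookie )
    {
        cookie = 0;
        try {
            cookie = register( );
            // Register( ) above returns 0 when no query interface was available.
            return cookie != 0;
        }
        catch( COMException ) {
            // Assumption: the user pressed Cancel, so there is no usable cookie.
            return false;
        }
    }
}
```

With the method from this post it would be called as RegistrationHelper.TryRegister( Register, out cookie ), and the cookie only gets stored when the call returns true.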

What does a cookie look like?

Well… for me the cookie was the value “1030818419”. For the testing I copied the value and stuck it in a constant… but as the SDK suggests, you will probably want to encrypt this and store it somewhere where you can read it out and decrypt it, because you’re going to need it for each search request.

  private const Int32 GoogleCookie = 1030818419;
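
For something a little less lazy than a constant, here’s a sketch of storing the cookie encrypted on disk. The SDK only suggests encrypting it; using the Windows Data Protection API through ProtectedData is my own choice, and the CookieStore class and the file path are made up for illustration. It needs a reference to System.Security (or the System.Security.Cryptography.ProtectedData package on newer .NET) and only works on Windows.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class CookieStore
{
    // Encrypts the cookie for the current Windows user and writes it to 'path'.
    public static void Save( string path, int cookie )
    {
        byte[] plain = BitConverter.GetBytes( cookie );
        byte[] cipher = ProtectedData.Protect( plain, null, DataProtectionScope.CurrentUser );
        File.WriteAllBytes( path, cipher );
    }

    // Reads the file back and decrypts it; only the same user can do this.
    public static int Load( string path )
    {
        byte[] cipher = File.ReadAllBytes( path );
        byte[] plain = ProtectedData.Unprotect( cipher, null, DataProtectionScope.CurrentUser );
        return BitConverter.ToInt32( plain, 0 );
    }
}
```

The idea would be to call CookieStore.Save once after a successful registration and CookieStore.Load before each search, instead of reading the constant.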

Dude.. I got a cookie! Mmmm…..

Now let’s actually start searching… Instantiate a GoogleDesktopQueryAPI object and invoke the Query method… or QueryEx, for the ability to read and write to the index. (In this example we just registered for a read-only cookie for searching!)

The Query method will return an object that contains a collection of search results. Unfortunately, there’s no enumerator or indexer so foreach doesn’t work here… sigh.
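
If you really want foreach back, one option is to hide the Next( ) calls behind a C# iterator. This is only a sketch: ResultSetHelper and Drain are names I invented, and the Next( ) call is passed in as a delegate so the code compiles without the Google COM references. It assumes Next( ) returns null when there is nothing left.

```csharp
using System;
using System.Collections.Generic;

public static class ResultSetHelper
{
    // 'next' stands in for a call such as results.Next( ) (assumed to
    // return null once the result set is exhausted).
    public static IEnumerable<T> Drain<T>( Func<object> next ) where T : class
    {
        object current;
        while( ( current = next( ) ) != null ) {
            // Skip anything that is not of the requested type.
            T item = current as T;
            if( item != null ) {
                yield return item;
            }
        }
    }
}
```

With a real result set the call would look something like ResultSetHelper.Drain<IGoogleDesktopQueryResultItem2>( ( ) => results.Next( ) ), which can then go straight into a foreach. Without a wrapper, the plain loop looks like this: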

  public static void Search( String forText )
  {
    Int32 cookie = GoogleCookie;
    GoogleDesktopQueryAPI queryAPI = new GoogleDesktopQueryAPI( );
    IGoogleDesktopQueryResultSet results = queryAPI.Query( cookie, forText, "file", null );

    // Keep calling Next( ) until it returns null.
    IGoogleDesktopQueryResultItem2 item;
    while( ( item = ( IGoogleDesktopQueryResultItem2 )results.Next( ) ) != null ) {
      Console.WriteLine( item.GetProperty( "uri" ) );
    }
  }

Looks kind of ugly, hey? In order to get properties on each search result you have to call the “GetProperty()” method on each result item. AND you have to pass it a string value as the property name…. Gross!

A possible way to get around the string property names would be to create a private class or structure of string constants… This isn’t that much more elegant, and in fact it could be broken out and organized into logical groups of properties.

  private class GoogleResultItemProperty
  {
    public const String ActualWork = "actual_work";
    public const String AlbumTitle = "album_title";
    public const String Artist = "artist";
    public const String Assistant = "assistant";
    public const String Attendees = "attendees";
    public const String Author = "author";
    public const String Categories = "categories";
    public const String Company = "company";
    public const String FolderName = "folder_name";
    public const String Format = "format";
    public const String IMAddress = "im_address";
    public const String Keywords = "keywords";
    public const String Language = "language";
    public const String LastModifiedTime = "last_modified_time";
    public const String Length = "length";
    public const String Location = "location";
  }

However, I keep getting COMExceptions when I try to access properties that I’m sure should be there… for instance I know that I’ve set the author property on…

But when I try to access the author property on that document a COMException is raised. Why is this happening? The whole purpose of using the Google Search SDK is so that I can retrieve that property without having to directly access that file!
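
I don’t have an answer for why, but the exception at least doesn’t have to kill the whole search loop. Here’s a sketch of a helper that falls back to a default value. PropertyReader and GetOrDefault are made-up names, the GetProperty call is passed in as a delegate so this compiles without the Google COM references, and it assumes the COMException simply means the property isn’t available for that item.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PropertyReader
{
    // 'getProperty' stands in for a call such as item.GetProperty( name ).
    public static object GetOrDefault( Func<string, object> getProperty, string name, object fallback )
    {
        try {
            return getProperty( name );
        }
        catch( COMException ) {
            // Assumption: the property is not available for this item.
            return fallback;
        }
    }
}
```

Inside the search loop that would be something like PropertyReader.GetOrDefault( name => item.GetProperty( name ), GoogleResultItemProperty.Author, null ). It hides the symptom rather than explaining it, so treat it as a stopgap.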

In conclusion, I still have much to learn about COM and the Google Desktop Search API. I’ve found the documentation on the API rather hard to read and understand. (It can be found here…) Hopefully, this helps a little!