Saturday, October 17, 2009

PHP MySQL Single Field Search Script

I wanted to provide a way for users to search a limited-access archival digital collection of over 36,000 records. Every record contained metadata, and approximately 60% of the collection also had machine-generated OCR data.

To solve this I decided to batch-load the "searchable" metadata and OCR data into a three-field MySQL database and make it searchable via a PHP script. The database contained an index field, a title field and a search field. The index and title fields are largely self-explanatory, with the index field containing a value from which a URL to the digital object concerned could be generated. The search field, however, contained a concatenation of all the relevant searchable metadata fields, with the machine-generated OCR data appended where it was available.

I then constructed a short PHP script to search the search field and return the results. I was surprised at how efficiently this process worked for our needs and decided that the basic code to search a single-field MySQL database might be of value to others! Copy the code for achieving this here. If you save this file as search.php and then edit it at the appropriately commented locations, you should quite quickly get it working. (This code was actually working on my test server prior to being made available.)


There is no attempt in my PHP code to evaluate the relevance of the results; they are just displayed sorted on the search field. I have since successfully used this PHP search script as the basis of more complex search scripts on other projects.
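For anyone who just wants the shape of the approach rather than the PHP download, the core is a single LIKE query over the concatenated search field. Here is a minimal sketch in Powershell, reusing the MySql.Data .NET connector described in the inventory post below; the table and column names (records, idx, title, search), the connection details and the example URL are assumptions, not the actual schema:

# Minimal sketch of the single-field search (schema, credentials and URL are assumed)
[void][System.Reflection.Assembly]::LoadFrom("C:\Scripts\MySql.Data.dll")

[string]$term = $args[0]
$conn = New-Object MySql.Data.MySqlClient.MySqlConnection("server=localhost;database=archive;uid=searchuser;pwd=secret;")
$conn.Open()

$cmd = $conn.CreateCommand()
# Search the concatenated metadata/OCR field and sort on it, as the PHP script does
$cmd.CommandText = "SELECT idx, title FROM records WHERE search LIKE @term ORDER BY search"
[void]$cmd.Parameters.AddWithValue("@term", "%$term%")

$reader = $cmd.ExecuteReader()
while ($reader.Read()) {
    # Generate a URL to the digital object from the index value
    write-host ("http://example.org/object/" + $reader["idx"] + " : " + $reader["title"])
}
$reader.Close()
$conn.Close()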

Friday, September 18, 2009

Automate Equipment Inventory to Database

It's been a while since I have posted to this blog, mostly because we had some major staff movements that have taken some time to work through. Making some space to write here is now somewhat realistic!

I want to share an improvement to a script I blogged about in Feb 2007, which used a Windows Powershell script to query the WMI Win32_Product class to retrieve information about all the software on a computer.

The original script output to an HTML file and has served us well, but I have occasionally thought that it would be great to also be able to write some or all of the data to a MySQL database. Recently I played around with this idea and managed to achieve a reasonable result.

I decided to use the .NET assembly available for this purpose from the MySQL site. The reason was that although the assembly can be installed using the installer, for distribution purposes the installer in fact just copies in and then registers a .DLL file, which can instead be copied directly onto the target systems. Provided you specify the exact location of this file in the Powershell script, the fact that it is not "registered" does not matter!
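The essence of the approach is short enough to sketch. The following is illustrative only, not the actual GetInventory_DB.ps1 script; the table and column names (software, machine, name, version), the server name and the credentials are all assumptions:

# Load the connector DLL directly from a known path (no registration needed)
[void][System.Reflection.Assembly]::LoadFrom("C:\Scripts\MySql.Data.dll")

$conn = New-Object MySql.Data.MySqlClient.MySqlConnection("server=mysqlhost;database=inventory;uid=invuser;pwd=secret;")
$conn.Open()

# Query installed software via WMI, as in the original inventory script
foreach ($product in (get-wmiobject -Class Win32_Product)) {
    $cmd = $conn.CreateCommand()
    $cmd.CommandText = "INSERT INTO software (machine, name, version) VALUES (@m, @n, @v)"
    [void]$cmd.Parameters.AddWithValue("@m", $env:COMPUTERNAME)
    [void]$cmd.Parameters.AddWithValue("@n", $product.Name)
    [void]$cmd.Parameters.AddWithValue("@v", $product.Version)
    [void]$cmd.ExecuteNonQuery()
}
$conn.Close()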

Setup

  1. The script writes its output to a MySQL server located on another machine. You can get the setup for the MySQL server from my personal website.
  2. Get a copy of the GetInventory_DB.ps1 Powershell script and copy it to the desired location on the machine(s) you wish to get inventory data from.
  3. Either install the MySQL .NET connector, or download it and just copy the .NET assembly .DLL file called "MySql.Data.dll" to the desired location on the machine(s) you wish to get inventory data from.

Note: If you use different file locations than the ones the script expects, you will need to edit the script accordingly. You will also need to edit the location of the MySQL server...

We use the AT command to schedule the script to run once a day on the local machines, at a time when the workstations are not in use. This means the data is up to date with any changes made in the previous 24 hours. Initially we accessed the MySQL database using an ODBC connector and an MS Access database, but I have since written a PHP script that interacts with the MySQL database. Rename that script "search.php" and edit it in the commented places to reflect your database, username and password, and it should work! (This script was copied directly from my test server...)
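For example, the scheduling can be as simple as an AT job like the following; the time and the script path here are illustrative assumptions, not our production settings:

at 02:00 /every:M,T,W,Th,F,S,Su "powershell.exe C:\Scripts\GetInventory_DB.ps1"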

Friday, July 4, 2008

Insert Metadata Into Images

I have recently been developing a consistent way to insert metadata into images. There are of course a number of GUI tools for doing this, but what I wanted was a simple script or batch-file solution that would work with large collections of images. I focused on writable tags for JPEGs, although I think that the solution I have developed is likely to work for other image types... (I just haven't tested this!)

After some investigation I decided to use the very capable ExifTool by Phil Harvey. Once I had downloaded the Windows executable version of this tool and renamed it "Exiftool.exe", I copied it into the Windows directory so as to be able to call it directly from a script. I then identified the range of metadata tags I wanted to target and put both the tags and the data into a text file using the Exiftool syntax. Finally I used Windows Powershell... to write a short script to do the actual work!
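By way of illustration, the data file is just one -TAG=VALUE assignment per line in Exiftool syntax; the script below skips any line beginning with #. These particular tags and values are made-up examples, not the ones from my collection:

# Example tag/value data file (Exiftool syntax)
-Artist=University of Waikato Library
-Copyright=Copyright, University of Waikato
-ImageDescription=Archival digital collection image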

A shortened version of the script follows:



# Author: Peter Stone
# Organisation: University of Waikato
# Department: Library, Technology Support Services
# Written: 26/06/2008
# Updated: July 04, 2008
# Filename: SetImagedata_v3.ps1
# Purpose: To insert metadata into (JPEG) images using the Exiftool utility developed by
#          Phil Harvey at http://www.sno.phy.queensu.ca/~phil/exiftool/

# Get the tag values from file
if ($args[0] -ne $null) {
    [string]$tagvaluepath = $args[0]

    # Does the file exist?
    if (test-path $tagvaluepath -pathType leaf) {
        write-host "Getting Exiftool tags and values: $tagvaluepath" -foregroundcolor "yellow"
        write-host
        $tags = get-content -Path $tagvaluepath
    } else {
        write-error "No file at $tagvaluepath. Specify a path to a file containing Exiftool syntax tags and values"
        exit 1
    }
} else {
    write-error 'Specify a path to a file containing Exiftool syntax tags and values'
    exit 1
}

# Get the image(s) filepath
if ($args[1] -ne $null) {
    [string]$imagepath = $args[1]
} else {
    write-error 'Specify a path to a directory of images or an image file'
    exit 1
}

# Is the path a single file or a directory?
if (test-path $imagepath -pathType leaf) {
    # Single file
    $file = $imagepath
    write-host "Metadata being entered onto image $file" -foregroundcolor "yellow"
    write-host
    foreach ($tag in $tags) {
        # Write the tag to the file, skipping comment lines in the data file
        if ($tag -notlike '#*') {exiftool $file $tag -overwrite_original}
    }
} else {
    # Get any valid files from the directory, sorted by name
    $imagepaths = get-childitem $imagepath -Include *.JPG -Recurse | sort-object
    foreach ($file in $imagepaths) {
        write-host "Metadata being entered onto image $file" -foregroundcolor "yellow"
        write-host
        foreach ($tag in $tags) {
            # Write the tag to the file, skipping comment lines in the data file
            if ($tag -notlike '#*') {exiftool $file $tag -overwrite_original}
        }
    }
}



The script requires two arguments:
The first argument is the path to the data file I mention above.
The second argument is the path to an image file OR a directory of image files.
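A typical invocation would therefore look something like this (the file and directory names are illustrative):

.\SetImagedata_v3.ps1 C:\Metadata\tags.txt C:\Images\Collection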

When run, the script loads the tags and values from the data text file, then locates the file(s) to be processed. It then writes the metadata from the data file to each image that has been targeted. In this way I now have a simple and elegant way to ensure that the images have appropriate metadata inserted into them before I upload them to the web.

A copy of the full Powershell script and the data file together with details for downloading is available from http://www.newgrove.co.nz/stuff/exiftool_script.html

Wednesday, March 26, 2008

Disk Defragmentation

It's been a while since I have found something new I have wanted to share on this site, but this little beauty is worth a mention. I have been a fan of regularly defragging hard drives for some years now, and until now have defragged all our Windows systems nightly using the Scheduler and Windows' own "defrag" utility.

JkDefrag is a really impressive piece of work, and it is free! It doesn't require installing, and it comes in commandline, GUI and screensaver versions. Jeroen has kept his defrag application focused on straightforward functionality with no bells or whistles. His rationale for how the defrag engine works makes good old-fashioned sense, and it seems to just do the job!

I am using the commandline version at work (300 XP workstations and three Windows Server 2003 servers), together with a script that runs the utility each night on a seven-day cycle. Every seventh day I use the -a 7 switch to effect an alphabetical reordering, and on the other nights I just run the application in its default mode.
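The cycle logic amounts to very little. The following is a sketch rather than my production script, and the JkDefrag path and target drive are assumptions:

# Sketch of the nightly seven-day JkDefrag cycle (path and drive are assumed)
if ((get-date).DayOfWeek -eq "Sunday") {
    # Every seventh day: alphabetical reordering via the -a 7 switch
    & "C:\Tools\JkDefrag.exe" -a 7 C:
} else {
    # Other nights: run in the default mode
    & "C:\Tools\JkDefrag.exe" C:
}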

Improvements like this are a bit like banking a small sum often. The improvement to the whole just gets better over time!

Monday, September 24, 2007

Federated Search

I recently attended the LIANZA conference held in Rotorua and had the good fortune to attend a session by Andy Neale of the National Library of New Zealand.

Andy spoke persuasively about the need to rationally assess what we are attempting to accomplish with our available resources. His main points were: identify the largest target group whose needs must be addressed; address those needs using as small a resource as possible; and be clever about this – don't reinvent stuff, as it is likely that there is something out there that will work for you…

He cited Google Coop as a federated search engine which could well meet most needs in this area... (also see a proof of concept for Google federating here...)

If you satisfy this group you will probably have met the needs of many others as well. This strategy will hopefully leave you with some capacity to target strategically important areas...

Tuesday, August 21, 2007

Virtual Machines Windows Services Modifier

We have recently been "playing" with virtual machines, looking to use them to further our interests. We have looked at the free Microsoft offerings as well as the free VMware products. From a user perspective I think the VMware software is impressive, but I do have an issue with the way VMware seems to reserve memory on my Windows XP box even when the virtual machines are not running. This was most noticeable when I took the software and virtual machines home and ran them on my older, less capable computer (a P2.4 with 1GB of RAM).

By default the VMware services are started automatically when the PC starts up (there are five of these services). This is fine if the virtual machine(s) are in use, but if we only want the virtual clients to be run manually on demand, then we would of course like to have all of the host PC's RAM available for normal use when the virtual machines aren't running!

My answer to this situation is in two parts:

  1. Change the way the VMware services are started by using the Management Console to change them from Automatic to Manual startup. This means the services are present but not running.
  2. Run a very short Powershell script to start the services and, once they are started, open the VMware Server application.

Yes, it takes about 20 seconds longer to start the VMware application this way, but it works! I have a second script that will shut the VMware services down again if you want to reclaim the RAM after using the virtual machines…
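The start-up script amounts to little more than the following sketch; the service display-name pattern and the program path are assumptions about a typical VMware Server install, not a copy of my script:

# Start all VMware services on demand (display-name pattern is assumed)
get-service | where-object { $_.DisplayName -like "VMware*" } | start-service

# Then launch the VMware Server console (path is assumed)
& "C:\Program Files\VMware\VMware Server\vmware.exe"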

Thursday, July 26, 2007

Randomly Select Data From A List

I was asked recently to provide a process that could randomly select patron data from a list. We wanted to select a random portion (20%) of our patrons for a LibQUAL survey. The process needed to be able to randomly select a row of data (a record) a number of times determined by the user. In this way it would be possible to randomly select a list of names from the original list...

My solution was to write a Powershell script. The script expects to find a text file called "Possible.txt" in the same directory as the script, and is run with one argument: the number of lines to be randomly selected. Once run, the script creates a second file called "Selection.txt" which will contain the selected lines of data in the order they were selected.
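The approach can be sketched in a few lines; this is an illustration of the idea rather than the script itself:

# Sketch: randomly select $args[0] unique lines from Possible.txt into Selection.txt
[int]$count = $args[0]
$lines = new-object System.Collections.ArrayList
get-content "Possible.txt" | foreach-object { [void]$lines.Add($_) }
$rand = new-object System.Random

$selection = @()
for ($i = 0; $i -lt $count -and $lines.Count -gt 0; $i++) {
    # Pick a random remaining line, record it, then remove it so it cannot repeat
    $index = $rand.Next(0, $lines.Count)
    $selection += $lines[$index]
    $lines.RemoveAt($index)
}
$selection | set-content "Selection.txt"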
