Tag Archive for 'tags'

Flickr Machine Tags for museum objects

At the museum we have been experimenting with machine tags as a way of letting users add their own photos to collection object pages. We have successfully used the Flickr API to display images of objects that users have tagged with our machine tags. Rather than just using an arbitrary string, Richard Barrett-Small from the V&A suggested that we come up with a convention that all cultural institutions can use. We have used the following format:

culture:[partner name]=[object ID]

So for the V&A, for example, we would use culture:vam=12345 to pull up images for an object with ID 12345, and an object at the Tate could use culture:tate=abc123.
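To make the convention concrete, here is a minimal Python sketch that builds and parses tags in this format. The function names and the set of characters allowed in a partner name or object ID are my own assumptions; the convention itself only fixes the culture:[partner name]=[object ID] shape.

```python
import re

# Pattern for the convention: culture:[partner name]=[object ID].
# The allowed characters are an assumption; the convention does not define them.
MACHINE_TAG = re.compile(r"^culture:(?P<partner>[a-z0-9]+)=(?P<object_id>\S+)$")


def make_machine_tag(partner, object_id):
    """Build a tag such as culture:vam=12345 (hypothetical helper)."""
    return f"culture:{partner.lower()}={object_id}"


def parse_machine_tag(tag):
    """Return (partner, object_id), or None if the tag does not fit the convention."""
    match = MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return match.group("partner"), match.group("object_id")
```

Calling make_machine_tag("vam", 12345) gives the V&A example tag, and parse_machine_tag splits a tag back into its partner and object ID so a collection page can work out which object a photo belongs to.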

We thought it would be good to put this idea out there and see what others think. The advantage of a convention like this is that you can predict the machine tags for each partner, and different partners can embed or share images using just a portion of the tag. For example, if I wanted to extract all Flickr images for the Tate, I would look for images tagged with culture:tate and leave out the ID, so that the tag acts like a wildcard.
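As a rough illustration of both kinds of lookup, the sketch below builds a request URL for Flickr's flickr.photos.search method using its machine_tags argument. It only constructs the URL and does not call the API. The search_url helper and the "KEY" placeholder are hypothetical, and the wildcard form (leaving the value empty after the equals sign) reflects my reading of Flickr's machine tag search syntax, so check it against the current API documentation before relying on it.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"


def search_url(api_key, partner, object_id=None):
    """Build a flickr.photos.search URL for one object or a whole partner."""
    # With no object ID the value is left empty ("culture:tate="), which is
    # assumed here to match any value for that partner, like a wildcard.
    value = "" if object_id is None else str(object_id)
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "machine_tags": f"culture:{partner}={value}",
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)


# One V&A object, then everything tagged for the Tate.
one_object = search_url("KEY", "vam", 12345)
whole_partner = search_url("KEY", "tate")
```

Because every partner shares the same namespace, the same helper works for any institution that adopts the convention; only the partner name changes.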

Not very sophisticated. Ideally you would be able to add more machine tags, such as categories, so that you could query all V&A images in the “painting” category. But since it will be the users who tag these photos, the process should be as simple as possible.

I would love to hear from other museum people about this.

Content analysis and auto-tagging

When I posted my last entry I didn’t think the next one would be 4 months later! I have been extremely busy with work and haven’t had much time to experiment with anything in my spare time. I have been involved with the National Museums Online Learning Project over at the V&A in London, and I am in the process of creating a federated search component across 9 national museums. I have also been fortunate enough to help some of the project partners develop or improve the existing collection search pages on their own sites.

Currently I am experimenting with content analysis, or auto-tagging. I initially decided to follow in the footsteps of PHM’s collection search and use Open Calais to see what it believes to be significant in the object description. I have to admit I was a bit disappointed. I don’t think Open Calais is suitable for museum content, since it mostly looks for news-related keywords:

Entities
Anniversary, City, Company, Continent, Country, Currency, EmailAddress, EntertainmentAwardEvent, Facility, FaxNumber, Holiday, IndustryTerm, MarketIndex, MedicalCondition, MedicalTreatment, Movie, MusicAlbum, MusicGroup, NaturalDisaster, NaturalFeature, OperatingSystem, Organization, Person, PhoneNumber, Product, ProgrammingLanguage, ProvinceOrState, PublishedMedium, RadioProgram, RadioStation, Region, SportsEvent, SportsGame, SportsLeague, Technology, TVShow, TVStation, URL

Events and Facts
Acquisition, Alliance, AnalystEarningsEstimate, AnalystRecommendation, Bankruptcy, BonusShares, BusinessRelation, Buybacks, CompanyAffiliates, CompanyCustomer, CompanyEarningsAnnouncement, CompanyEarningsGuidance, CompanyInvestment, CompanyLegalIssues, CompanyLocation, CompanyMeeting, CompanyReorganization, CompanyTechnology, CompanyTicker, ConferenceCall, CreditRating, FamilyRelation, IPO, JointVenture, ManagementChange, Merger, MovieRelease, MusicAlbumRelease, PersonAttributes, PersonCommunication, PersonEducation, PersonPolitical, PersonPoliticalPast, PersonProfessional, PersonProfessionalPast, PersonTravel, Quotation, StockSplit

When I attempted to extract the significant keywords from the following object, I only got 2 tags back:

There are clearly other words in the description that are meaningful. What about the most obvious keyword, “dress”?!

I then tried using Yahoo’s Content Extraction service and I was much happier with the results:

Of course, the advantage of Open Calais is that it neatly groups your tags into specific categories (see the list above). But again, this being a Reuters project, it is very much news-centric, and it ignores a lot of the important semantic metadata we want to see with museum content.