Deep-linking, embedding, and services

From Livetrix

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.



Linking to...

...a specific category search page

This can be useful for faculties that want to link to a search page specific to their interests.

When you browse the subject guide, the URL updates each time you choose to view another category, for example:

http://purplesearch.rug.nl/subjects.html?cat=10

Such URLs can be linked to from anywhere.

Note that this links to the category by an identifier. If the category is deleted and created again it will receive a new identifier, and old links will not work anymore.


To provide only a list of resources, that is, to hide the search field:

http://purplesearch.rug.nl/subjects.html?cat=10&nosearch=y
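
If you generate such links from a script, a small helper can build them. This is a minimal sketch: the host and the cat and nosearch parameters come from the examples above, while the function name category_url and its arguments are made up for illustration.

```python
from urllib.parse import urlencode

# Host taken from the examples above; change it for your own installation.
BASE = "http://purplesearch.rug.nl"

def category_url(cat_id, hide_search=False, base=BASE):
    """Build a subject-guide link for one category identifier (hypothetical helper)."""
    params = {"cat": cat_id}
    if hide_search:
        # nosearch=y hides the search field and shows only the resource list
        params["nosearch"] = "y"
    return f"{base}/subjects.html?{urlencode(params)}"

print(category_url(10))
print(category_url(10, hide_search=True))
```

Keep in mind that the identifier changes when a category is deleted and recreated, so generated links may need to be refreshed.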

...a saved item

A single saved item will not change ID, so you can bookmark and copy/paste the URL directly.

...a saved item citation list

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.

This can be interesting for professors and such, to hand their organized citation lists to students or other interested people.

The way in which this is done is still in flux, but will probably become a matter of URL copy-pasting as well.


It may be interesting to make this embeddable in other ways, e.g. via XML output that other systems can reshape according to their needs.


...a search to be made on the fly

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.

This can be useful to show someone else the search results you found: search results are cleared daily, so result pages cannot be bookmarked.

(note: this will change)

A form-based search can be imitated using:

http://purplesearch.rug.nl/search.html?WRD=arctic+fish
http://purplesearch.rug.nl/search.html?WRD=word&WAU=nerbonne

Those variables currently reflect Metalib fields. The most interesting are probably:

  • WRD: find anywhere in the record
  • WTI: find specifically in the title
  • WAU: author. You should only fill in the last name, to avoid cases where variations would be filtered out
  • ISSN: to search for articles in a journal

Values should be URL-encoded (after UTF-8 encoding when Unicode characters are used).


Omitting databases to search in implies automatic database choice. You can add your own, but will have to use target identifiers specific to your installation. Example:

http://purplesearch.rug.nl/search.html?WRD=arctic+fish&targetlist=RUG00080+RUG00926
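
The encoding step is easy to get wrong by hand, so here is a minimal sketch that lets Python's standard library do the UTF-8 and URL encoding. The field names and targetlist come from the examples above; the helper name search_url and its arguments are assumptions for illustration.

```python
from urllib.parse import urlencode

# Host taken from the examples above; change it for your own installation.
BASE = "http://purplesearch.rug.nl"

def search_url(fields, targets=None, base=BASE):
    """Build an on-the-fly search link (hypothetical helper).

    fields:  dict of Metalib field codes, e.g. {"WRD": "arctic fish"}
    targets: optional list of installation-specific target identifiers
    """
    params = dict(fields)
    if targets:
        # Identifiers are space-separated; urlencode turns the spaces into '+'
        params["targetlist"] = " ".join(targets)
    # urlencode encodes text as UTF-8 first, then percent-encodes it
    return f"{base}/search.html?{urlencode(params, encoding='utf-8')}"

print(search_url({"WRD": "arctic fish"}, targets=["RUG00080", "RUG00926"]))
print(search_url({"WRD": "word", "WAU": "nerbonne"}))
```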

...a (simplified) OpenURL lookup

(note: this may change)

This can be useful to link someone to an article via RUG subscriptions, if you have the citation in an external resource and are willing to rewrite it (manually or automatically) into a URL of the form shown here.

Currently, this can be done with a URL like the following:

http://purplesearch.rug.nl/openurl.direct?issn=0024-3841&year=2008&volume=118&issue=4&spage=499

The variables you can use include:

  • aulast: Last name
  • year
  • atitle: Article title (semi-redundant with pages)
  • issn: (semi-redundant with jtitle)
  • jtitle: (mostly redundant if ISSN is given)
  • volume
  • issue
  • spage: start page. (pages usually require volume and issue to be meaningful)
  • epage: end page
  • isbn
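
Rewriting citations automatically could look roughly like the following. This is a sketch only: the variable names are the ones listed above, while the helper name openurl_link and the choice to skip empty values are assumptions.

```python
from urllib.parse import urlencode

# Host taken from the example above; change it for your own installation.
BASE = "http://purplesearch.rug.nl"

def openurl_link(citation, base=BASE):
    """Turn a citation dict into a simplified OpenURL link (hypothetical helper).

    citation uses the variable names listed above, e.g.
    {"issn": "0024-3841", "year": 2008, "volume": 118, "issue": 4, "spage": 499}
    """
    # Leave out fields the external resource did not provide
    params = {k: v for k, v in citation.items() if v not in (None, "")}
    return f"{base}/openurl.direct?{urlencode(params)}"

print(openurl_link({"issn": "0024-3841", "year": 2008,
                    "volume": 118, "issue": 4, "spage": 499}))
```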

Exposed data


Within admin/ (to avoid abuse; some of this is heavy on the CPU and database):
  • targetoverview.html - Overview of targets (representative strings, average result amount, contained materials, basic usage count)
  • targetoverview.targetstrings - prints the top so-many representative strings for one target; linked to from targetoverview.html.

Subject Codes

  • expose.categories - dynamic list of co-occurring category codes. See also the related service, mentioned below.

Exposed services

Search API

Another project at the Groningen University Library is an alert system that regularly fetches all results from a single resource for a specific ISSN in a specific year and checks for new articles.

Going from a search intent to result XML takes several steps. In pseudocode, roughly (see also Metalib notes):

sess              = login_request(user, pass)                      # log in, get a session
group_number      = find_request(sess, query, database_ids)        # start the search
(set_number, ...) = find_group_info_request(sess, group_number)    # look up the result set(s)
xml               = present_request(sess, set_number, set_entry)   # fetch record(s) from a set

Since you have to store various pieces of information on the client end and this is (perhaps overly) complex for many simple utilities, we created a simple wrapper API, detailed below.


Doing a search

An HTTP request to:

/allresults.xml?year=2001&issn=0001-2998&db=RUG00008

...will trigger a search, wait for it to finish (which may take up to a minute), then return a very basic XML document detailing the result status of that search.

The URL takes the database(s) to search in:

  • a single db (by Metalib ID; note these are installation-specific)

and query elements:

  • issn
  • year
  • author
  • title
  • anywhere (word/phrase in any part of the record, as determined by the source)
  • isbn
  • subject (probably of limited use)

All are optional, but at least one is necessary for a search, and you can pass in more than one for each of these (though that doesn't always make sense).

Query fields are cleaned for characters that are problematic to Metalib, and some are rewritten into formats that are more likely to give results.

If you want to avoid any rewriting, you can use direct to pass in a metalib query you have already formatted. See also Design considerations#Query_features_and_limitations about some things you can and can't do.
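
A client might assemble the request URL like this. It is a minimal sketch: the path, db and the query field names are taken from the description above, while the helper name allresults_url, the base argument (the installation path of the wrapper, not given here) and the single-value-per-field restriction are assumptions.

```python
from urllib.parse import urlencode

# Query elements listed above
QUERY_FIELDS = ("issn", "year", "author", "title", "anywhere", "isbn", "subject")

def allresults_url(db, base="", **query):
    """Build a wrapper-API search URL (hypothetical helper).

    db:    a single installation-specific Metalib database ID
    base:  wherever the wrapper API is installed (assumption, not documented here)
    query: one or more of the QUERY_FIELDS, one value each in this sketch
    """
    unknown = set(query) - set(QUERY_FIELDS)
    if unknown:
        raise ValueError(f"unknown query field(s): {sorted(unknown)}")
    given = {k: v for k, v in query.items() if v not in (None, "")}
    if not given:
        raise ValueError("at least one query field is needed for a search")
    return f"{base}/allresults.xml?{urlencode({**given, 'db': db})}"

print(allresults_url("RUG00008", year=2001, issn="0001-2998"))
```

Since the request blocks until the search finishes, whatever HTTP client you use should be given a timeout of at least a minute.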

Result status

The code waits for the search to finish. This means this interface is primarily useful when you know a source will react relatively quickly. If you want an interface more like the livetrix interface, contact us. On a successful search, the request will return something like:

<result state="DONE" set_number="004384" amount="35" errormessage=""/>

The state will be one of:

  • DONE (successful search),
  • STOP (timeout in search) or
  • ERROR (search failed), which implies the errormessage attribute has something to say.

The error message attribute hands through an error message if the search was problematic within metalib. On a more serious error, for example when the service could not reach Metalib, it will return:

<error>a message</error>

See also Metalib for some related details.
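
Handling both response shapes on the client side could be sketched as follows, using only Python's standard XML parser. The element and attribute names are the ones shown above; the function name parse_search_status and the choice to raise exceptions for anything other than DONE are assumptions.

```python
import xml.etree.ElementTree as ET

def parse_search_status(xml_text):
    """Return (set_number, amount) for a DONE search, raise otherwise (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == "error":
        # Serious failure, e.g. the service could not reach Metalib
        raise RuntimeError(f"service error: {root.text}")
    if root.tag != "result":
        raise ValueError(f"unexpected response element: {root.tag}")
    state = root.get("state")
    if state != "DONE":
        # STOP (timeout) or ERROR; errormessage may explain why
        raise RuntimeError(f"search state {state}: {root.get('errormessage')}")
    # Keep set_number as a string so its leading zeros survive
    return root.get("set_number"), int(root.get("amount"))

sample = '<result state="DONE" set_number="004384" amount="35" errormessage=""/>'
print(parse_search_status(sample))
```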

Fetching results

Since a 'get all results' command could easily take much too long (and in fact time out) on slower sources, and 'get all' may be very many indeed, you should do this in chunks. This is more coding, but partly just unavoidable.

Based on the details about a search, you can fetch chunks of results by querying:

/page.xml?sn=004384&se=1&amt=10             #fetches 1-10
/page.xml?sn=004384&se=11&amt=10            #fetches 11-20
/page.xml?sn=004384&se=21&amt=10            #fetches 21-30
/page.xml?sn=004384&se=31&amt=5             #fetches 31-35

sn is set_number, se is starting entry, amt is how many records to fetch starting at that start entry number.

Having that last amt be lower is significant: Metalib will complain and fail if you try to fetch a range of documents with indices beyond the end of results. It is not (directly) aware of the amount of results, only you are.

Note that metalib may occasionally pass through invalid XML to you. This interface does not do any correcting (it may in the future).
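
The chunking arithmetic, including the shorter final chunk, can be sketched like this. The path and the sn, se and amt parameters are the ones described above; the function name page_urls and the base argument are assumptions.

```python
def page_urls(set_number, amount, chunk=10, base=""):
    """List the page.xml URLs needed to fetch all results of one set (hypothetical helper).

    set_number and amount come from the result status of the search.
    """
    urls = []
    for start in range(1, amount + 1, chunk):
        # Never ask for entries past the end: Metalib fails on such ranges
        amt = min(chunk, amount - start + 1)
        urls.append(f"{base}/page.xml?sn={set_number}&se={start}&amt={amt}")
    return urls

for url in page_urls("004384", 35):
    print(url)
```

For the 35-result example this produces the same four requests as listed above. Because invalid XML may be passed through, it is worth parsing each chunk separately so one bad chunk does not cost you the rest.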

Related subject codes

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.

Allows lookups of code meanings, and reports codes that commonly co-occur with a given code. It currently supports cross-lookups between LCC subject codes, Dewey codes and BCL codes.

Examples:

http://purplesearch.rug.nl/expose.similarcodes?code=149
http://purplesearch.rug.nl/expose.similarcodes?code=44.75
http://purplesearch.rug.nl/expose.similarcodes?code=RJ506

Since the codes currently supported have exclusive formats, you do not need to specify what it is. For future compatibility, you may want to force the code type, for example:

http://purplesearch.rug.nl/expose.similarcodes?code=dewey:149
http://purplesearch.rug.nl/expose.similarcodes?code=bcl:44.75
http://purplesearch.rug.nl/expose.similarcodes?code=lcc:RJ506

Since it looks up all the code context for all relations, it may take up to a second or so. You can make it slightly faster by adding &nolookup=y to the URL.
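
A caller that always forces the code type could build its requests as in this sketch. The URL, the dewey, bcl and lcc prefixes and nolookup=y come from the text above; the helper name similarcodes_url and its arguments are assumptions.

```python
from urllib.parse import urlencode

# Host taken from the examples above; change it for your own installation.
BASE = "http://purplesearch.rug.nl"

def similarcodes_url(code, code_type=None, lookup=True, base=BASE):
    """Build a related-subject-codes request (hypothetical helper).

    code_type: optional prefix such as "dewey", "bcl" or "lcc"
    lookup:    pass False to add nolookup=y for a slightly faster response
    """
    value = f"{code_type}:{code}" if code_type else str(code)
    params = {"code": value}
    if not lookup:
        params["nolookup"] = "y"
    # safe=':' keeps the type prefix readable, as in the examples above
    return f"{base}/expose.similarcodes?{urlencode(params, safe=':')}"

print(similarcodes_url("RJ506", code_type="lcc"))
print(similarcodes_url("44.75", code_type="bcl", lookup=False))
```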


Note that our information on LCC codes does not go beyond the dot.

Experimental version. Not sure yet what to do with detailed (dotful) LCC numbers.

Impact factor

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.

You can retrieve the impact factor for a journal, by ISSN. Since this is not public information, this is restricted to the institution. Example:

http://purplesearch.rug.nl/expose.impactfactor?issn=00280836
This returns plain text consisting of a single number, for example 29.273 (high because that's Nature), or a plain-text error, probably telling you that we do not know.
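
Because the response is either a number or an error text, a client has to tell the two apart. A minimal sketch, assuming that any body which does not parse as a number is an error message, and that the ISSN is sent without its hyphen as in the example above (whether a hyphenated ISSN is also accepted is not stated here). The helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Host taken from the example above; change it for your own installation.
BASE = "http://purplesearch.rug.nl"

def impactfactor_url(issn, base=BASE):
    """Build the lookup URL; the hyphen is dropped to match the example (assumption)."""
    return f"{base}/expose.impactfactor?{urlencode({'issn': issn.replace('-', '')})}"

def parse_impact_factor(body):
    """Return the impact factor as a float, or None if the body is an error text."""
    try:
        return float(body.strip())
    except ValueError:
        # Not a number: treat the plain-text body as an error message
        return None

print(impactfactor_url("0028-0836"))
print(parse_impact_factor("29.273\n"))
```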

Administration

Settings, logs, documentation, statistics

These are links in the administration view, available in the /admin/ directory under the main installation path.

Target and category management

Available in /subjectadmin.html under the main installation path. Restricted to certain users.


Playthings

Developer test pages, further exposed data, etc.

Phrase relations

This article (or section) is a stub.
It is here because it was planned, and some notes were dropped here.

Reports:

  • phrases that sound similar
  • co-occurring phrases
  • a guess as to what field the phrase is likely in
  • hypernyms, hyponyms
  • which resources seem to be useful in that they have resources for it

As a browsable thing:

As an XML service:

