GENUKI Gazetteer maintenance
Source files
The searches are all performed on a MySQL database using SQL queries. Changes are not made to the actual database itself, but to a set of source files which are used to replace the entire database when the need arises.
The source files are simply a set of comma-separated text files named places.csv, with one file per county.
The master files are in the usual place; contact Phil Stringer via the link at the bottom of the page if you can't remember where that is. There are no links to these files, which makes it difficult for search engines to find them and prevents them from being harvested by somebody else. The basic file is named places.csv. When a county section is being maintained by the county maintainer, this file is placed in the standard place in your county web pages. Again, contact Phil if you can't remember where that is.
A special program is used to collect all the county sections and build the database from source.
When any section has been changed, a database rebuild is required to make your change live. A simple rebuild is performed
at 6:00 am every day, so you should see your changes live the next day.
When you first take on maintenance of your county section, just place a copy of the central file in your area
and ask for a rebuild before you make any changes. Otherwise changes that may be made to the central copy could be lost.
The central copy is never changed once there is a devolved section.
The fields in the
places.csv file are as follows, but minor changes may occur during development of this
facility. Each field description contains the field name used in the MySQL database to identify
it clearly in any notes, and the size and type.
- CCC (3 char) - Chapman county code - upper case.
- Location - You can use either of the following two formats:
- GRIDREF (8 char) UK Ordnance Survey grid reference - full 8 characters. There are hints for using online maps to find exact locations. It is not possible to use Irish Ordnance Survey grid references, as the scripts used to search and maintain the gazetteer are written in Perl and appropriate conversion routines for Irish OS grid references are not available.
- LAT (9 char) LON (9 char) Latitude and longitude - specified as a pair of comma separated numbers e.g. "54.602699,-5.935707"
In the actual database, the location uses an internal format based on the UK OS grid ref, which has been extended to cover the west of Ireland. This provides a single reference key, and a mechanism for selection. This is only visible on some URLs. On all the display screens the location appears as a UK OS grid reference or, for all of Ireland, as latitude and longitude. If you need the reference key, use the link from the gaz script which gives a list of tabular results. On that list, look at the link to the gazetteer entry.
- If the location field is left blank (as a temporary measure!) the centre location of the county will be used when this entry gets incorporated into the database, and the APPROX field will have 'C' put in automatically regardless of what is in the csv file.
- APPROX (1 char) - Flag indicating whether the grid reference is exact or approximate. The original data source contained approximate grid references giving the kilometre square, but not the exact point within it. A 'Y' or 'Yes' indicates an approximate reference, and an 'N' or 'No' an exact one. A 'C' can be used to note that the location specified is that of the centre of the county (but in such cases it is better to leave the location field empty). A 'P' can be used to indicate that the location is that of the centre of the parish rather than the actual location of the place.
- PLACE (32 char) - The place name.
- PRIME (boolean) - A flag indicating whether this is the primary entry for the Town/Parish page in the subsequent URL. 'Yes' is used for the primary entry, 'No' for the rest. For each different URL in column G there must only be one place entry with this flag set to 'Yes'.
- MOREPLACE (32 char) - Additional comments about the location. There are frequently multiple places in a county with the same name, and this field can be used to help distinguish them. E.g. there are at least 5 Broughtons in Lancashire, and we could include here 'near Preston'.
- URL (90 char) - The URL of the Town/Parish page covering the area where this place is. This is typically the historic parish or township this place was in, but things may have changed in modern times with the building of new towns etc. Nevertheless, use this field to point to the page where you will place information about this place.
Kain, R.J.P., Oliver, R.R., Historic Parishes of England and Wales: an Electronic Map of Boundaries before 1850 with a Gazetteer and Metadata [computer file]. Colchester, Essex: History Data Service, UK Data Archive [distributor], 17 May 2001. SN: 4348. This is a very good source of boundary information to help you decide which town/parish page to associate place names with.
- UNSPEC (boolean) - Alias flag. Some places have alternative names, e.g. English and Welsh names for the same place. Choose a name that you want to be the first to appear (primary name) and create a normal gazetteer entry for it. For all the other names create additional entries with the same gridref, but for these, set this flag to Yes. For the alias entries field E (PRIME) will always be No. This is the old technique for specifying aliases. It is much easier now to use the Alias field (Column N) rather than having separate entries for aliases.
- BARONY (32 char) - The name of the barony in which the place is located in Ireland. For England/Wales this can be used to hold the hundred, or for Scotland the district. For Ireland this field is used to link townlands to the relevant parish, because we do not have web pages for most of the parishes and so the normal link via the URL field cannot be made. N.B. The name of the barony does not get displayed, so there is no requirement for multiple entries if it sits over a border.
- PARISH (32 char) - The name of the civil parish in which the place is located.
For Ireland this field is used to link townlands to the relevant parish. If parish boundaries run through a townland, put the others in here as well, using a colon (:) character as a separator. Do not put space characters next to the colon. As we do not have any web pages for most of the Irish parishes, the normal link via the URL field cannot be made.
- TYPE (32 char) - The type of place e.g. parish, townland, hamlet.
For Ireland all parishes should have the text Parish in this field and townlands the text Townland, as this is used to link townlands to their parishes when we have no URL for them.
- QUOTE (32 char) - The name of the file containing a quote describing the place. If present, this quote will appear in gazetteer entry web pages. It is planned to use the quotes extracted by Mel Lockie from Lewis's Topographical dictionaries, and these are currently in /big/Gazeteer/quotes. This field just contains the name of the file, and not the directory in which it is held.
- Notes - This field will never get entered into the database, but is a place within the csv file to hold any notes that the maintainer may need, particularly during development of new place entries.
- Aliases - This does not become a database field, but is used by the database rebuild process to create additional entries with the same contents as the current entry but with the alias as the place name, the Prime flag set to 'N' and the Unspec (Alias) flag set to 'Y'. If there is more than one alias, use a : as a separator in the list. Avoid leaving spaces at the start and end of alias names.
- FHS - The code(s) for the FHS(s) covering this Town/Parish.
- OTHERCCC - Some parishes and places have county boundaries running through them. This field helps handle these and can hold the county code(s), separated by colon (:) characters if there is more than one other county.
- HIDEME - If a parish has a county boundary running through it and we have multiple entries, then this field can be used to hide the less important ones. If you code a Y character here then the entry won't be returned as the result of a general search. However, if we only want the results for the county it is in, then it will be shown. To help identify the predominant entry, code an N character for it. (N is the default.) This technique can also be used for towns lying over county boundaries.
- ID - An identifier, unique within the county, for a town/parish. Each entry for which PRIME is set should have a unique ID that never changes, so we can consistently refer to the town/parish even if the URL or location gets adjusted in the future. This is a character string and it is suggested that it be based on the name of the town/parish, with additional characters where there are multiple ones with the same name.
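As a rough illustration of the field list, the following Python sketch reads places.csv text, expands the Aliases column in the way described for the rebuild process, and checks the rule that each URL has only one entry with PRIME set. It is not the rebuild program itself. The column order is an assumption taken from the order of the list above (column A is CCC, column R is ID), and the sample row, its grid reference and its URL are made up for the example.

```python
import csv
import io

# Assumed column order: the order of the field list (A = CCC ... R = ID).
COLUMNS = [
    "CCC", "LOCATION", "APPROX", "PLACE", "PRIME", "MOREPLACE", "URL",
    "UNSPEC", "BARONY", "PARISH", "TYPE", "QUOTE", "NOTES", "ALIASES",
    "FHS", "OTHERCCC", "HIDEME", "ID",
]


def read_places(text):
    """Parse places.csv text into a list of dicts keyed by COLUMNS."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        # Pad short rows so that every column is present.
        fields = fields + [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def expand_aliases(row):
    """Return the row followed by one extra entry per colon-separated alias."""
    entries = [row]
    for alias in row["ALIASES"].split(":"):
        alias = alias.strip()
        if not alias:
            continue
        extra = dict(row)       # same contents as the current entry
        extra["PLACE"] = alias  # alias becomes the place name
        extra["PRIME"] = "N"
        extra["UNSPEC"] = "Y"
        entries.append(extra)
    return entries


def duplicate_primes(rows):
    """Return the URLs that have more than one entry with PRIME set."""
    counts = {}
    for row in rows:
        if row["URL"] and row["PRIME"].strip().lower() in ("y", "yes"):
            counts[row["URL"]] = counts.get(row["URL"], 0) + 1
    return sorted(url for url, n in counts.items() if n > 1)


# Hypothetical sample row: the grid reference and URL are invented.
SAMPLE = ",".join([
    "LAN", "SD434642", "No", "Morecambe", "Yes", "",
    "https://example.org/LAN/Morecambe", "No", "", "", "", "", "",
    "Poulton le Sands", "", "", "", "Morecambe",
]) + "\n"

if __name__ == "__main__":
    for entry in expand_aliases(read_places(SAMPLE)[0]):
        print(entry["PLACE"], entry["PRIME"], entry["UNSPEC"])
```

Because the location is a single column, a latitude and longitude pair has to be quoted in the file, and the csv module then returns it as one field.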
The database has been constructed from source data that was based on post-1974 counties. So there are some additional sections that need to be dealt with once you have started, as local knowledge is required for some entries to determine which county some places are in. There may also be additional entries supplied by other people since a devolved copy of the county section was taken.
Providing additional information
Here are some tasks that county maintainers could undertake to improve the gazetteer. It may well be worthwhile recruiting a competent volunteer to do the bulk of the work, but quality control procedures are likely to be needed, and cooperation to ensure the correct URLs are used on the entries.
- The initial entries contain only approximate locations; for England, Scotland, Wales and the Isle of Man they only specify the kilometre square on the map in which to find them. We need to find a more exact location for them (and change field C to No) so that in our visual displays we get a much better picture. The statistics page has a link for each county, listing all the places with an approximate location, with links to find them on online maps.
Convert the approximate grid references to exact ones, and set the APPROX flag to 'No'. There are odd locations that are completely wrong, so these do need fixing. The change from approximate to exact grid references is particularly useful for the links into online maps. Until then the placename isn't near the centre of the map on current maps, and frequently on the oldmaps site it is actually off screen as the map resolution is that much higher.
A useful technique for this is to ensure you have a Nearby Places link at the top of Town/Parish pages. Use that to access the places that have approximate locations ('~' shown on distance). Click on the grid reference to get into the online maps and use the point and click interfaces these provide to get exact locations. There is a minor disadvantage in that the place names on maps are placed in the nearest empty space, so some judgement is required and hopefully some local knowledge. For large places such as towns the town centre is a good choice for the exact location. For smaller diverse places such as villages the location of the parish church is a useful alternative.
- In the initial file there will be a number of places that don't have a URL associated with them. So just put one in for the appropriate Town/Parish page for the area where that place is. If it is an actual primary Town/Parish page that wasn't guessed by the program which created the initial data, then you will need to set the PRIME flag to Yes. Each place should have a URL specified, which directs the user to a page which may contain information about it. Don't be confused into thinking we need a page for every place name in the gazetteer; the URL will normally be that of the township/civil parish page covering the area in which this specific place is located. The statistics page has a link for each county, listing all the places without a URL field.
- Compare the database source with your list of Towns and Parishes. Add in any that aren't in the database.
- Add some more of the larger places that don't have a Town/Parish page of their own. Two places which you could use for suggestions (which are also likely to be the sorts of names users will search for) are the 1891 census database (search it on registration district names), and Brett Langston's places in registration districts.
- If you have a gazetteer, then add in entries for the additional places that you will have in it.
- Get out your maps, and go over the county section by section, adding in additional places.
The gazetteer started as a list of parishes in each pre-1974 county along with approximate grid references. Subsequently a large file of placenames with approximate grid references and post-1974 counties was obtained. A special program was written to compare each entry in this file with the parish database and choose the pre-1974 county. The technique used was to look for all places within 3 miles. If all were within the same pre-1974 county then that was chosen. If more than one county was found they were flagged as needing a manual choice as were any with no nearby parishes. Those for which a unique pre-1974 county was found were added into the database, and most entries are now in there. The rest are held in separate files for each county.
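The county-matching technique described above can be sketched as follows. This is only an illustration, not the original program: it assumes locations are held as (easting, northing) pairs in kilometres and uses straight-line distance, and the function names and the parish data are invented for the example.

```python
import math

MILES_PER_KM = 0.621371


def distance_miles(a, b):
    """Straight-line distance between two (easting_km, northing_km) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) * MILES_PER_KM


def choose_county(place_xy, parishes, radius_miles=3.0):
    """Return (county, None) when unique, otherwise (None, reason)."""
    nearby = {
        county
        for xy, county in parishes
        if distance_miles(place_xy, xy) <= radius_miles
    }
    if len(nearby) == 1:
        return nearby.pop(), None
    if not nearby:
        return None, "no nearby parishes"
    # More than one county found: flag for a manual choice.
    return None, "multiple counties: " + ", ".join(sorted(nearby))


# Hypothetical parish database: ((easting_km, northing_km), pre-1974 county).
PARISHES = [
    ((10.0, 10.0), "LAN"),
    ((12.0, 11.0), "LAN"),
    ((16.0, 10.0), "YKS"),
    ((20.0, 10.0), "YKS"),
]

if __name__ == "__main__":
    for point in [(11.0, 10.0), (14.0, 10.0), (50.0, 50.0)]:
        print(point, choose_county(point, PARISHES))
```

Places for which the function returns a reason rather than a county correspond to the entries that were held back for a manual choice.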
Access the files via the statistics page and copy and paste from there. The file entries contain hot links to the online mapping tools as an aid to processing their contents.
moreplaces.csv - Additional entries which have been identified by another county maintainer as being in your county since devolved maintenance was undertaken. Add these to your county's places.csv file. In order for the database rebuild to be able to remove entries taken from moreplaces.csv, an advanced database rebuild is required. If this is required, ask Phil to perform one, and your moreplaces.csv will be eliminated or reduced in size.
checkplaces.csv - These are the places determined to be on the county borders which could be in more than one county. An entry for each borderline place appears in all the relevant county checkplaces.csv files, but with a Chapman code for the county heading under which each list is found. E.g. if the place could be in county A or county B, the county A list will contain the county A Chapman code, and the county B list will contain the county B Chapman code. All the rest of the fields are the same. There are also some extra fields at the end of the line listing which counties they could be in.
Each one needs to be examined and, if it is in your county, trim the extra fields and add it to your places.csv file. Otherwise trim it, put the correct county code in it and pass it to Phil, who will add it to the moreplaces.csv file for the correct county. Please remember to trim off the extra information and put in the correct county code before passing it on.
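The trimming step can be sketched in Python as below. This is a hedged example only: it assumes a checkplaces.csv line carries the same 18 columns as places.csv (CCC to ID) followed by the extra county fields, which the page does not state explicitly, and the function name is invented for the example.

```python
import csv
import io

STANDARD_FIELDS = 18  # assumed: columns A (CCC) to R (ID) of places.csv


def trim_checkplaces_line(line, county_code):
    """Drop the trailing candidate-county fields and set the county code."""
    fields = next(csv.reader(io.StringIO(line)))
    fields = fields[:STANDARD_FIELDS]  # trim the extra information
    fields[0] = county_code.upper()    # Chapman codes are upper case
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical borderline entry listed as possibly in LAN or YKS.
    sample = "LAN,SD900500,Yes,Exampleton" + "," * 14 + ",LAN,YKS\n"
    print(trim_checkplaces_line(sample, "yks"), end="")
```

Check the result by eye before adding it to places.csv or passing it on, since the number of extra fields may differ from this assumption.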
Other gazetteer settings
There are a few entries in the county database entry which are used by some of the search routines:
- The name of the county town.
- Its grid reference.
- A grid reference for the physical centre of the county.
- Whether this county section of the gazetteer is being maintained and developed.
Updating Irish sections
The Irish information has come from a number of sources and needs some tidying up. The work is not complicated, just a steady check of the entries, combining duplicates and adding locations to make the data more useful.
Usage
A number of cgi scripts are available to access the gazetteer information in a number of ways.
- Add a link on each town/parish page to call the gazetteer and show nearby places in the button bar at the top of each page. The call to locate nearby places is via a link to the nearby cgi script with appropriate parameters.
- Under the Maps heading you can put a call to placemaps which shows a page with links to online maps for all the places you recorded as being within this town/parish.
- In your list of town/parish pages why not add in a call to gaz to enable users to work out the page for other places within your county. Put in a hidden CCC parameter to default to your county.
- Under the Gazetteers heading on town/parish pages why not add a call to howfar so users can find the distance to other places.
Controlling placement in the search results
nearby and refers to the displays they produce.
The search results are designed so that the places they show are sorted according to the distance from the start point, with all the places covered by an individual Town/Parish page grouped together with it. This is achieved primarily by sorting by distance, but also by using information in various database fields.
The search results initially appear as two sections, the first with links to GENUKI pages, and then the rest. The second section consists of those entries in the database with an empty URL field. It is expected that over time the second section will completely disappear as URLs are added to the existing entries.
The entries that are grouped under a Town/Parish page entry all have the URL of the Town/Parish page. The thing that distinguishes them as being subsidiary is the PRIME flag, which is set to 'Yes' just for the Town/Parish page entry.
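The ordering described so far can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of the search scripts: the dictionary keys are invented, and ordering each group by the distance of its PRIME entry (or its nearest member when no entry has PRIME set) is a guess at how the grouping is combined with the distance sort.

```python
def order_results(results):
    """Split results into (linked, unlinked) lists in display order."""
    linked = [r for r in results if r["url"]]
    unlinked = sorted(
        (r for r in results if not r["url"]), key=lambda r: r["distance"]
    )

    # Group the linked entries by the Town/Parish page URL they share.
    groups = {}
    for r in linked:
        groups.setdefault(r["url"], []).append(r)

    def group_distance(members):
        primes = [m["distance"] for m in members if m["prime"]]
        return min(primes) if primes else min(m["distance"] for m in members)

    ordered = []
    for members in sorted(groups.values(), key=group_distance):
        # The PRIME entry leads its group; the rest follow by distance.
        ordered.extend(
            sorted(members, key=lambda m: (not m["prime"], m["distance"]))
        )
    return ordered, unlinked


# Hypothetical search results; distances are in arbitrary units.
RESULTS = [
    {"place": "Hamlet A", "url": "/p1", "prime": False, "distance": 0.5},
    {"place": "Parish One", "url": "/p1", "prime": True, "distance": 1.0},
    {"place": "Parish Two", "url": "/p2", "prime": True, "distance": 0.8},
    {"place": "Farm B", "url": "", "prime": False, "distance": 0.2},
    {"place": "Hamlet C", "url": "/p2", "prime": False, "distance": 2.0},
]

if __name__ == "__main__":
    first, second = order_results(RESULTS)
    print([r["place"] for r in first])
    print([r["place"] for r in second])
```

In this sketch Hamlet A is nearer than Parish One but is still listed after it, because the entry with PRIME set heads its group.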
The places that can appear at the start of the subsidiary group as just a list of place names
separated by commas without a distance and grid reference are defined as follows. They have the
same grid reference and URL as the Town/Parish page entry but they also have the unspecific
location flag set. This means that the place is somewhere within the Town/Parish but we don't know
or won't say exactly where it is. This can also be used for alternative names where places have
changed their name over time. E.g. Poulton le Sands is now called Morecambe. If you have an alternate
name or alias for a place create an identical entry to the primary, but put the alias in the place name field
and set the
UNSPEC/alias field to