GENUKI Gazetteer maintenance


THIS INFORMATION IS NOW SOMEWHAT OUT OF DATE. See the Gazetteer Guide at https://www.genuki.org.uk/maintainers/GazetteerGuide

This information is for GENUKI county maintainers and documents how to integrate the gazetteer into their pages, and develop and maintain their county sections. Unfortunately we have no information for the Channel Islands.

Source files

The searches are all performed on a MySQL database using SQL queries. Changes are not made to the actual database itself, but to a set of source files which are used to replace the entire database when the need arises.

The source files are simply comma-separated text files named places.csv, with one file per county. The master files are in the usual place; contact Phil Stringer via the link at the bottom of the page if you can't remember where that is. There are deliberately no links to these files, to make it difficult for search engines to find them and to prevent them being harvested by somebody else. When a county section is being maintained by the county maintainer, its places.csv file is placed in the standard place in your county web pages. Again, contact Phil if you can't remember where that is.

A special program is used to collect all the county sections and build the database from source. When any section has been changed, a database rebuild is required to make your change live. A simple rebuild is performed at 6:00 am every day, so you should see your changes live the next day.

When you first take on maintenance of your county section, just place a copy of the central file in your area and ask for a rebuild before you make any changes. Otherwise changes that may be made to the central copy could be lost. The central copy is never changed once there is a devolved section.

The fields in the places.csv file are as follows, but minor changes may occur during development of this facility. Each field description contains the field name used in the MySQL database to identify it clearly in any notes, and the size and type.

  1. CCC (3 char) - Chapman county code, upper case.
  2. Location - You can use either of the following two formats:
    • GRIDREF (8 char) UK Ordnance Survey grid reference - full 8 characters. There are hints for using online maps to find exact locations. It is not possible to use Irish Ordnance Survey grid references, as the scripts used to search and maintain the gazetteer are written in Perl and appropriate conversion routines for Irish OS grid references are not available.
    • LAT (9 char) LON (9 char) Latitude and longitude - specified as a pair of comma separated numbers, e.g. "54.602699,-5.935707". In the actual database, the location uses an internal format based on the UK OS grid ref, which has been extended to cover the west of Ireland. This provides a single reference key, and a mechanism for selection. This is only visible on some URLs. On all the display screens the location appears as a UK OS grid reference or, for all of Ireland, as latitude and longitude. If you need the reference key, use the link from the gaz script which gives a list of tabular results, and look at the link to the gazetteer entry there.
    • If the location field is left blank (as a temporary measure!) the centre location of the county will be used when this entry gets incorporated into the database, and the APPROX field will have 'C' put in automatically regardless of what is in the csv file.
  3. APPROX (1 char) - Flag indicating whether the grid reference is exact or approximate. The original data source contained approximate grid references giving the kilometre square, but not the exact point within it. A 'Y' or 'Yes' indicates an approximate reference, and an 'N' or 'No' an exact one. A 'C' can be used to note that the location specified is that of the centre of the county (but in such cases it is better to leave the location field empty). A 'P' can be used to indicate that the location is that of the centre of the parish rather than the actual location of the place.
  4. PLACE (32 char) - The place name.
  5. PRIME (boolean) - A flag indicating whether this is the primary entry for the Town/Parish page in the subsequent URL. 'Yes' is used for the primary entry, 'No' for the rest. For each different URL in column G there must only be one place entry with this flag set to 'Yes'.
  6. MOREPLACE (32 char) - Additional comments about the location. There are frequently multiple places in a county with the same name, and this field can be used to help distinguish them. E.g. there are at least 5 Broughtons in Lancashire, and we could include here 'near Preston'.
  7. URL (90 char) - The URL of the Town/Parish page covering the area where this place is. This is typically the historic parish or township this place was in but things may have changed in modern times with the building of new towns etc. Nevertheless use this field to point to the page where you will place information about this place.
    Kain, R.J.P., Oliver, R.R., Historic Parishes of England and Wales: an Electronic Map of Boundaries before 1850 with a Gazetteer and Metadata [computer file]. Colchester, Essex: History Data Service, UK Data Archive [distributor], 17 May 2001. SN: 4348.
    is a very good source of boundary information to help you decide which town/parish page to associate place names with.
  8. UNSPEC (boolean) - Alias flag. Some places have alternative names, e.g. English and Welsh names for the same place. Choose a name that you want to be the first to appear (primary name) and create a normal gazetteer entry for it. For all the other names create additional entries with the same gridref, but for these, set this flag to Yes. For the alias entries field E (PRIME) will always be No. This is the old technique for specifying aliases. It is much easier now to use the Alias field (Column N) rather than having separate entries for aliases. Set it to Yes to indicate this. This is used in the formatting of the output to prevent a distance or URL being given alongside this name. The name just appears in a comma separated list under the main Town/Parish heading. You can use it if you don't know the actual location, or if you simply want to group these place names as alternatives under the relevant Town/Parish.
  9. BARONY (32 char) - The name of the barony in which the place is located in Ireland. For England/Wales this can be used to hold the hundred, and for Scotland the district. For Ireland this field is used to link townlands to the relevant parish, as for most of the parishes we do not have any web pages and the normal link via the URL field cannot be made. N.B. The name of the barony does not get displayed and so there is no requirement for multiple entries if it sits over a border. At the moment this field is not used by any of the scripts that display gazetteer entries. It is primarily there to help identify entries for which we do not yet have a location.
  10. PARISH (32 char) - The name of the civil parish in which the place is located. For Ireland this field is used to link townlands to the relevant parish. If parish boundaries run through a townland, put the others in here as well, using a colon : character as a separator. Do not put in space characters next to the colon. As we do not have any web pages for most of the Irish parishes the normal link via the URL field cannot be made. At the moment this field is not used by any of the scripts that display gazetteer entries. It is primarily there to help identify entries for which we do not yet have a location.
  11. TYPE (32 char) - The type of place e.g. parish, townland, hamlet. For Ireland all parishes should have the text Parish in this field and townlands the text Townland as this is used to link townlands to their parishes when we have no URL for them. At the moment this field is not used by any of the scripts that display gazetteer entries. It is primarily there to help identify entries for which we do not yet have a location.
  12. QUOTE (32 char) - The name of the file containing a quote describing the place. If present this quote will appear in gazetteer entry web pages. It is planned to use the quotes extracted by Mel Lockie from Lewis's Topographical dictionaries and these are currently stored at /big/Gazeteer/quotes. This field just contains the name of the file, and not the directory in which it is held.
  13. Notes - This field will never get entered into the database, but is a place within the csv file to hold any notes that the maintainer may need, particularly during development of new place entries.
  14. Aliases - This does not become a database field, but is used by the database rebuild process to create additional entries with the same contents as the current entry but with the alias as the place name, the Prime flag set to 'N' and the Unspec (Alias) flag set to 'Y'. If there is more than one alias, use a : as a separator in the list. Avoid leaving spaces at the start and end of alias names.
  15. FHS - The code(s) for the FHS(s) covering this Town/Parish.
  16. OTHERCCC - Some parishes and places have county boundaries running through them. This field helps handle these and can hold the county code(s), separated by colon : characters if there is more than one other county.
  17. HIDEME - If a parish has a county boundary running through it and we have multiple entries, then this field can be used to hide the less important ones. If you code a Y character here then it won't be returned as the result of a general search. However if we only want the results for the county it is in, then it will be shown. To help identify the predominant entry, code an N character for them. (N is the default) This technique can also be used for towns lying over county boundaries.
  18. ID - An identifier, unique within the county, for a town/parish. Each entry for which PRIME is set should have a unique ID that never changes so we can consistently refer to the town parish even if the url or location gets adjusted in the future. This is a character string and it is suggested that it be based on the name of the town/parish with additional characters where there are multiple ones with the same name.

    We plan to use this as a unique key in scripts etc. to identify a place and its database entry. So it needs to be something easy to remember, and not too long. Don't put in spaces or any odd characters that may cause problems when we use it as a parameter to a script. In the database it will have the county code as a prefix to make it unique, which lets you have the flexibility to choose your own values. You don't need to put the county code in the csv file; the upload will add it in for you. The prefix will NOT be used to identify the county, it is just there to make it easier to choose a unique value.

    So some examples might be.

    • Lytham
    • BroughtonP for the Broughton near Preston.
    • BroughtonS for the Broughton near Salford.
    • BroughtonF for Broughton-in-Furness. It doesn't have to be the full place name, just a unique code that can be guessed quite easily.

    Now is the time to choose sensible values. At some stage we will have to automate the choice and after that there will not be an opportunity to make any changes.
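
As a rough illustration of how the fields above fit together, the following Python sketch reads rows in the column order listed, expands the Aliases field in the way the rebuild process is described as doing, and reports any URL that has more than one PRIME entry. It is not the actual rebuild program. The absence of a header row, the function names, the example.org URL and the sample values are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Column order assumed from the field list above; no header row is assumed.
FIELDS = ["CCC", "LOCATION", "APPROX", "PLACE", "PRIME", "MOREPLACE", "URL",
          "UNSPEC", "BARONY", "PARISH", "TYPE", "QUOTE", "NOTES", "ALIASES",
          "FHS", "OTHERCCC", "HIDEME", "ID"]


def read_places(text):
    """Parse places.csv text into a list of dicts keyed by field name."""
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        raw = raw + [""] * (len(FIELDS) - len(raw))  # pad short rows
        rows.append(dict(zip(FIELDS, (v.strip() for v in raw))))
    return rows


def expand_aliases(row):
    """Return the row followed by one copy per colon-separated alias.

    Each copy takes the alias as its place name, with PRIME set to 'N'
    and UNSPEC set to 'Y', as described for field 14.
    """
    out = [row]
    for alias in filter(None, (a.strip() for a in row["ALIASES"].split(":"))):
        out.append(dict(row, PLACE=alias, PRIME="N", UNSPEC="Y", ALIASES=""))
    return out


def duplicate_primes(rows):
    """Return the URLs that have more than one entry flagged as PRIME."""
    counts = Counter(r["URL"] for r in rows
                     if r["URL"] and r["PRIME"].upper() in ("Y", "YES"))
    return sorted(url for url, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Illustrative row only: the grid reference and URL are placeholders.
    sample = ("GLA,SS650930,No,Swansea,Yes,,https://example.org/GLA/Swansea,"
              "No,,,,,,Abertawe,,,N,Swansea\n")
    for row in read_places(sample):
        for entry in expand_aliases(row):
            print(entry["PLACE"], entry["PRIME"], entry["UNSPEC"])
    print(duplicate_primes(read_places(sample)))
```

Running a check like duplicate_primes before asking for a rebuild is one way to catch a second 'Yes' in column E for the same URL.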

The database has been constructed from source data that was based on post-1974 counties. So there are some additional sections that need to be dealt with once you have started, as local knowledge is required to determine which county some places are in. There may also be additional entries supplied by other people since a devolved copy of the county section was taken.

Providing additional information

Here are some tasks that county maintainers could undertake to improve the gazetteer. It may well be worthwhile recruiting a competent volunteer to do the bulk of the work but quality control procedures are likely to be needed, and cooperation to ensure the correct URLs are used on the entries.
  • The initial entries contain only approximate locations. For England, Scotland, Wales and the Isle of Man they only specify the kilometre square on the map in which to find them. We need to find a more exact location for them (and change field C to No) so that in our visual displays we get a much better picture. The statistics page has a link for each county, listing all the places with an approximate location, with links to find them on online maps.

    Convert the approximate grid references to exact ones, and set the APPROX flag to 'No'. There are odd locations that are completely wrong, so these do need fixing. The change from approximate to exact grid references is particularly useful for the links into online maps. Until then the placename isn't near the centre of the map on current maps, and frequently on the oldmaps site it is actually off screen as the map resolution is that much higher.

    A useful technique to do this is to ensure you have a Nearby Places link at the top of Town/Parish pages. Use that to access the places that have approximate locations ('~' shown on distance). Click on the grid reference to get into the online maps and use the point and click interfaces these provide to get exact locations. There is a minor disadvantage in that the place names on maps are placed in the nearest empty space, so some judgement is required and hopefully some local knowledge. For large places such as towns the town centre is a good choice for the exact location. For smaller diverse places such as villages the location of the parish church is a useful alternative.

  • In the initial file there will be a number of places that don't have a URL associated with them. So just put one in for the appropriate Town/Parish page for the area where that place is. If it is an actual primary Town/Parish page that wasn't guessed by the program which created the initial data, then you will need to set the PRIME flag to Yes. Each place should have a url specified, which directs the user to a page which may contain information about it. Don't be confused into thinking we need a page for every place name in the gazetteer, the url will normally be that of the township/civil parish page covering the area in which this specific place is located. The statistics page has a link for each county, listing all the places without a URL field.
  • Compare the database source with your list of Towns and Parishes. Add in any that aren't in the database.
  • Add some more of the larger places that don't have a Town/Parish page of their own. Two places which you could use for suggestions (which are also likely to be the sorts of names users will search for) are the 1891 census database (search it on registration district names), and Brett Langston's places in registration districts.
  • If you have a gazetteer, then add in entries for the additional places that you will have in it.
  • Get out your maps, and go over the county section by section, adding in additional places.
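
To show what the difference between an exact and an approximate reference amounts to, here is a minimal Python sketch that decodes an 8-character UK OS grid reference (two letters and six digits) into eastings and northings in metres, and then finds the kilometre square that contains it. It follows the standard National Grid lettering scheme and is not taken from the gazetteer's own Perl scripts. The function names and the sample reference are made up for the example.

```python
import re

# 25-letter alphabet used for OS grid squares (the letter I is not used).
LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"


def gridref_to_metres(gridref):
    """Convert e.g. 'SD123456' to (easting, northing) in metres.

    The result is the south-west corner of the 100 m square that the
    8-character reference names.
    """
    m = re.fullmatch(r"([A-HJ-Z]{2})(\d{3})(\d{3})", gridref.strip().upper())
    if m is None:
        raise ValueError(f"not an 8-character grid reference: {gridref!r}")
    l1, l2 = (LETTERS.index(c) for c in m.group(1))
    # The first letter selects a 500 km square, the second a 100 km square.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)
    easting = e100km * 100_000 + int(m.group(2)) * 100
    northing = n100km * 100_000 + int(m.group(3)) * 100
    return easting, northing


def kilometre_square(gridref):
    """Return the south-west corner of the kilometre square holding gridref."""
    easting, northing = gridref_to_metres(gridref)
    return easting - easting % 1000, northing - northing % 1000


if __name__ == "__main__":
    print(gridref_to_metres("SD123456"))
    print(kilometre_square("SD123456"))
```

A location known only to its kilometre square can be well over a kilometre from the true position (the diagonal of the square is about 1.4 km), which is why such a place may not appear near the centre of a detailed map.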

Additional source files

Take a look at the statistics page to see if there are additional entries that need adding to a county section or which need evaluating to see if they are in that county.

The gazetteer started as a list of parishes in each pre-1974 county along with approximate grid references. Subsequently a large file of placenames with approximate grid references and post-1974 counties was obtained. A special program was written to compare each entry in this file with the parish database and choose the pre-1974 county. The technique used was to look for all places within 3 miles. If all were within the same pre-1974 county then that was chosen. If more than one county was found they were flagged as needing a manual choice as were any with no nearby parishes. Those for which a unique pre-1974 county was found were added into the database, and most entries are now in there. The rest are held in separate files for each county.
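
The matching technique described above can be sketched in Python as follows. This is a reconstruction from the description, not the special program itself. It assumes that places and parishes are given as eastings and northings in metres, that straight-line distance is used, and that the function and variable names are free to choose.

```python
import math

THREE_MILES_M = 3 * 1609.344  # three statute miles in metres


def choose_county(place, parishes, radius_m=THREE_MILES_M):
    """Pick a pre-1974 county for a place from the parishes near it.

    place    -- (easting, northing) in metres
    parishes -- iterable of (county_code, easting, northing)

    Returns (county_code, candidates). county_code is None when the place
    needs a manual choice, either because several counties were found
    nearby or because no parish lies within the radius.
    """
    e0, n0 = place
    nearby = {ccc for ccc, e, n in parishes
              if math.hypot(e - e0, n - n0) <= radius_m}
    candidates = sorted(nearby)
    if len(candidates) == 1:
        return candidates[0], candidates
    return None, candidates


if __name__ == "__main__":
    # Made-up parish positions for illustration only.
    parishes = [("LAN", 350000, 430000),
                ("LAN", 352000, 431000),
                ("YKS", 360000, 430000)]
    print(choose_county((351000, 430000), parishes))
    print(choose_county((356000, 430000), parishes))
    print(choose_county((400000, 500000), parishes))
```

In this sketch the places that come back with a county code correspond to those added straight into the database, and the rest correspond to the entries held back in the separate per-county files.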

Access the files via the statistics page and copy and paste from there. The file entries contain hot links to the online mapping tools as an aid to processing their contents.

  • moreplaces.csv - Additional entries which have been identified by another county maintainer as being in your county since devolved maintenance was undertaken. Add these to your county's places.csv file. In order for the database rebuild to be able to remove entries taken from moreplaces.csv an advanced database rebuild is required. If this is required ask Phil to perform one, and your moreplaces.csv will be eliminated or reduced in size.
  • checkplaces.csv - These are the places determined to be on the county borders which could be in more than one county. An entry for each borderline place appears in all the relevant county checkplaces.csv files but with a Chapman code for the county heading under which each list is found. E.g. if the place could be in county A or county B, the county A list will contain the county A Chapman code, and the county B list will contain the county B Chapman code. All the rest of the fields are the same. There are also some extra fields at the end of the line listing which counties they could be in.

    Each one needs to be examined and if it is in your county, trim the extra fields and add it to your places.csv file. Otherwise trim it and put the correct county code in it and pass it to Phil who will add it to the moreplaces.csv file for the correct county. Please remember to trim off the extra information and put in the correct county code before passing it on.
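
The trimming step can be sketched in Python as below. It assumes that a checkplaces.csv line consists of the 18 standard fields followed by the extra county fields, which is an interpretation of the description above rather than a documented layout. The function name and the sample line are hypothetical.

```python
import csv
import io

N_FIELDS = 18  # assumed number of standard places.csv fields


def trim_checkplace(line, correct_ccc):
    """Drop the extra trailing fields and set the Chapman county code."""
    row = next(csv.reader([line]))
    if not row:
        raise ValueError("empty line")
    row = row[:N_FIELDS]           # keep only the standard fields
    row[0] = correct_ccc.upper()   # field 1 is the Chapman county code
    out = io.StringIO()
    csv.writer(out, lineterminator="").writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up entry: 18 standard fields followed by two candidate counties.
    fields = (["LAN", "SD123456", "Yes", "Sometown", "No", "", "", "No"]
              + [""] * 8 + ["N", ""])
    line = ",".join(fields + ["LAN", "YKS"])
    print(trim_checkplace(line, "YKS"))
```

The trimmed line can then be added to your own places.csv, or passed to Phil if the place belongs to another county.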

Other gazetteer settings

The county's database entry holds several settings which are used by some of the search routines:
  • The name of the county town.
  • Its grid reference.
  • A grid reference for the physical centre of the county.
  • Whether this county section of the gazetteer is being maintained and developed.

Updating Irish sections

The Irish information has come from a number of sources and needs some tidying up. The work is not complicated, just a steady check of the entries to combine duplicates and add locations to make the data more useful.

Boundaries

We have found some sources of KML data which can be used to plot boundary information on maps.