
Installation Prerequisites

Ondrej Kosarko edited this page Jun 19, 2024 · 15 revisions

This is the old installation guide for v5-based code. Make sure you really want to read this and not NewInstallation.

There are a few important prerequisites to prepare, and decisions to make, before deployment:

Architecture of server/backups

The LINDAT/CLARIN server is a virtual machine that can be migrated between two independent physical servers. It is backed up at the VM (virtual machine) level, the OS level and the DSpace level (backups of important DSpace directories, backups of databases, and replication of AIPs).

OS

We use an Ubuntu LTS release.

Basic Software

Here's the software you'll need before you start installing our fork of DSpace. If you need a more detailed guide, you can follow the DSpace installation guide. Because this software is a fork of DSpace, installing DSpace itself is not required.

  • ant (>= 1.8.0 required)
  • postgresql (>= 9.4 required)
  • jdk (>= 1.8 required)
  • tomcat (>= 7.0.50)
  • maven (>= 3.0.5)
  • make
  • wget, xmllint, xsltproc, unzip (these are used in some of the admin scripts)
  • apache/nginx

If you know what you're doing, you can verify your setup with the Prerequisites checklist, or continue reading the sections below.
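The minimum versions listed above can also be checked with a short script. The following is only a minimal sketch in Python, not part of the LINDAT/CLARIN tooling: the binary names in COMMANDS (ant, psql, java, mvn) are assumptions about what is on your PATH, and the version parsing simply takes the first dotted number in the command output, so unusual version banners will be reported as unknown. Tomcat is left out because it has no single version command that works across installations.

```python
import re
import shutil
import subprocess

# Minimum versions copied from the software list above.
MINIMUMS = {
    "ant": (1, 8, 0),
    "postgresql": (9, 4),
    "jdk": (1, 8),
    "tomcat": (7, 0, 50),
    "maven": (3, 0, 5),
}

# Assumed binary names and flags; adjust them to your system.
COMMANDS = {
    "ant": ["ant", "-version"],
    "postgresql": ["psql", "--version"],
    "jdk": ["java", "-version"],
    "maven": ["mvn", "-version"],
}

# Helper tools that only need to exist on PATH.
TOOLS = ("make", "wget", "xmllint", "xsltproc", "unzip")


def parse_version(text):
    """Return the first dotted number in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(name, version_output):
    """Compare the version found in version_output with MINIMUMS[name]."""
    found = parse_version(version_output)
    return found is not None and found >= MINIMUMS[name]


def missing_tools(tools=TOOLS):
    """Return the tools from the list that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for name, command in COMMANDS.items():
        if shutil.which(command[0]) is None:
            print(f"{name}: {command[0]} not found on PATH")
            continue
        result = subprocess.run(command, capture_output=True, text=True)
        # java -version prints to stderr, so both streams are inspected.
        output = result.stdout + result.stderr
        status = "ok" if meets_minimum(name, output) else "too old or unknown"
        print(f"{name}: {status}")
    print("missing tools:", missing_tools() or "none")
```

Versions are compared as integer tuples rather than strings, so that 1.10 is correctly treated as newer than 1.8.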

DSpace dependencies

Web server + servlet container

  • Additional information: Web Server
  • Required: HTTPS enabled, ability to deploy Java webapps, ability to run Perl/CGI

Postgres

  • Additional information: SUNScholar
  • Required: PostgreSQL database cluster up and running, user with permissions to create/access databases

You might need to tweak pg_hba.conf and/or the number of connections, shmsize and other variables that limit how many connections the databases can accept.
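Before tuning those settings, it can help to confirm that the database cluster is reachable at all. The sketch below is a generic TCP check in Python and assumes the cluster listens on localhost at PostgreSQL's default port 5432; both values are placeholders. It only shows that something accepts connections on that port. It does not prove that pg_hba.conf lets your user in or that the connection limit is high enough.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host and port; 5432 is the PostgreSQL default.
    print("postgres reachable:", port_is_open("localhost", 5432))
```

The same function can be pointed at other services this page requires, such as the mail server, by changing the host and port.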

AAI

  • Additional information: Shibboleth
  • Required:
    • running Shibboleth process
    • membership in (at least) a national federation
    • configuration including a secured shib_test.pl

You'll be able to complete the repository software installation without it, but as a CLARIN centre you are obliged to provide federated authentication. Authentication is based on Shibboleth, which requires joining one or more federations. LINDAT/CLARIN is part of eduID.cz (national federation), eduGAIN (through eduID.cz and following the DP-CoC), and SPF. Joining a federation can take some time and depends on the national federation's requirements.

Handles

  • Strictly speaking, this is not a prerequisite for the installation, but a decision has to be made that will influence the final setup. You can obtain the prefix and run the server after you've configured and tested the repository installation (but before going into production)
  • Additional information: Handle Server, PIDs
  • Required: own prefix, running handle server

DSpace assigns identification strings to objects in the form of a specific kind of PID (persistent identifier), namely handles. For these IDs to become real PIDs, you need a valid prefix and a handle server listening for this prefix that resolves the IDs to URLs. There are several ways to do this; LINDAT/CLARIN supports obtaining handles from EPIC (API v1/v2) or using the DSpace handle server integration (the server delegates work to DSpace).

To obtain your own prefix, buy (register) one at http://www.handle.net/, follow the DSpace-specific instructions sent by handle.net, set the prefix in dspace.cfg and start the handle server bundled with DSpace. You can verify the setup by going to http://hdl.handle.net/YOUR-PREFIX/something and checking the log file handle-plugin.log, which should contain the request. Your own prefix can also be hosted by EPIC using the APIv2.
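The verification step can be sketched as a small Python helper that builds the resolver URL and then looks for the request in handle-plugin.log. This is only an illustration under stated assumptions: the location of handle-plugin.log depends on your installation (the relative path below is a placeholder), and the sketch assumes the logged request contains the literal PREFIX/suffix string, which you should confirm against your own log format.

```python
from pathlib import Path

RESOLVER = "http://hdl.handle.net"


def handle_url(prefix, suffix):
    """Build the resolver URL for a handle such as YOUR-PREFIX/something."""
    return f"{RESOLVER}/{prefix}/{suffix}"


def log_mentions(log_path, needle):
    """Return True if any line of the log file contains needle."""
    path = Path(log_path)
    if not path.is_file():
        return False
    with path.open(encoding="utf-8", errors="replace") as log:
        return any(needle in line for line in log)


if __name__ == "__main__":
    prefix, suffix = "YOUR-PREFIX", "something"  # placeholders from the text
    print("Open this URL in a browser first:", handle_url(prefix, suffix))
    # Placeholder path; point it at the handle-plugin.log of your installation.
    seen = log_mentions("handle-plugin.log", f"{prefix}/{suffix}")
    print("request found in log:", seen)
```

If the request never shows up in the log, the handle server is probably not reachable from the global resolver or is not registered for that prefix.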

Note: Using a shared prefix is very simple but a bad idea in the long run. You cannot migrate your PIDs from one handle server to another if you use a shared prefix (you cannot just take a bunch of PIDs from a prefix).

Mail server

  • Additional information: SUNScholar
  • Required: running mail server

DSpace relies on a working mail server for sending reports, alerts, verification and exception logs. LINDAT/CLARIN uses mailing-lists for receiving these emails.

Google Analytics and PIWIK

DSpace has its own internal statistics, but we extended it with private Google Analytics and PIWIK integration. To use it, obtain your own UA string and developer access (API key file) for GA, and an auth token with idSite for PIWIK.

We are explicitly tracking OAI-PMH and bitstream downloads for PIWIK.
