19/05/2009

This is a preliminary description of how to install Connotea under OS X. At some point I’ll roll this into the Connotea code; in the meantime please send feedback to i.mulvany@nature.com.

In February 2009 a Mac security update broke my perl installation (you can read more about that in this Slashdot article), ultimately forcing me to reinstall my entire system. One of the main reasons for this is that I had a heavily customized perl setup as part of installing Connotea, and I had based this on top of the system perl. This was a bad idea, but it was the quickest way to get Connotea up and running on my local machine at the time.

Having learnt my lesson, I decided to set up Connotea on a totally separate perl install. The following instructions assume a totally clean machine to begin with. You will need to have the Apple Developer Tools installed.

For installing on a clean *nix please follow the instructions in the main README file.

Connotea is written in perl and uses mod_perl under apache to talk to a MySQL datastore. The code uses memcached to help with performance. It uses a large number of perl modules that need to be installed before the software can run. This guide will run through setting up all of the required dependencies.

Before getting to installing Connotea let’s set up our dependencies.

A word on MacPorts vs compiling your own.

Bauhaus has a very nice description of how to set up a dev environment in OS X; however, that description depends heavily on MacPorts. MacPorts is an easy way to install software onto your Mac, and it will do so in a way that will not conflict with system installs. However, it provides the wrong architecture of perl for Connotea (a perl without threads, as discussed in the Perl section), so you will have to install perl and all of the perl dependencies separately. This is a bit of a shame, as installing the dependencies is much faster under MacPorts than it is under CPAN. We can install some of the other required components under MacPorts, and that is what we will do. One last note about MacPorts: it will install a very large number of dependencies onto your system. This is expected, though scary, behaviour. Don’t worry.

To use MacPorts you first install the MacPorts application itself. After installing it you can get the port version of a given piece of software with the following command:

$ sudo port install port-name

where port-name is the software you want to install. MacPorts creates its own path system and will install into

/opt

MySQL

Download the package from the MySQL site and follow the instructions for installation. For a quick cheat sheet of common commands for MySQL under OS X there is a great MySQL Reference for OS X over at Comentum.

After installing, the most important commands are the ones to set an admin password, to start the server and to stop the server.

setting the root password:

$ /usr/local/mysql/bin/mysqladmin -u root password sniggle

starting and stopping from the command line:

$ sudo /usr/local/mysql/bin/mysqld_safe --user=mysql &
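To stop the server again you can ask it to shut down through mysqladmin. What follows is a minimal sketch of both directions, assuming the default package location /usr/local/mysql used elsewhere in this guide. The function names start_mysql and stop_mysql and the MYSQL_BIN variable are made up for this example.

```shell
# Sketch only: helper names and MYSQL_BIN are assumptions for illustration.
# MYSQL_BIN defaults to the location used by the MySQL package for OS X.
MYSQL_BIN=${MYSQL_BIN:-/usr/local/mysql/bin}

# Start the server in the background, as in the command above.
start_mysql() {
    sudo "$MYSQL_BIN/mysqld_safe" --user=mysql &
}

# Ask the running server to shut down cleanly.
# -p makes mysqladmin prompt for the root password you set earlier.
stop_mysql() {
    "$MYSQL_BIN/mysqladmin" -u root -p shutdown
}
```

Source this from your shell profile and call start_mysql or stop_mysql as needed.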

If you wish to set preferences for MySQL when it is running you can create a conf file. This should be under /etc and should be called my.cnf.

Make sure you know where your socks are.

There is a known issue with MySQL under OS X in that the socket file is not located where OS X thinks it is. There are a number of solutions to this.
Edit the my.cnf file so that the mysql server creates its sock file in a location of your choosing. The recommended setting is:

[mysqld]
socket=/tmp/mysql.sock

Now, if you are using a perl installed via MacPorts and you want to use a non-MacPorts version of mysql, then you might encounter the following problem with the DBI module.

failed: Can't connect to local MySQL server through socket '/opt/local/var/run/mysql5/mysqld.sock'

What is going on here is that the DBI module is reading a setting telling it that the mysql socket should be found in ‘/opt/local/var/run/mysql5/mysqld.sock’. I’m not sure where it gets this setting from but there are a couple of options.

Use a MacPorts version of mysql. See
Tell the DBI module explicitly where to look for the socket. In any perl script that calls the DBI module you can use the following line to specify the location of the socket file:

$ENV{MYSQL_UNIX_PORT}='/tmp/mysql.sock';

Place this in your script after you load the DBI module and before you open the database connection.

You can symlink your actual sock file to the location where the DBI module expects it, via:
$ sudo ln -s /tmp/mysql.sock /opt/local/var/run/mysql5/mysqld.sock

You may first have to create the /opt/local/var/run/mysql5/ directory.

You may wish to create an alias so that this link does not get removed on server start up.
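The symlink option can be sketched as a small shell function that creates the missing directory first and then replaces any stale link. The name link_socket is made up for this example, and the two paths are the ones quoted above; adjust them if your error message names a different socket.

```shell
# Sketch only: link_socket is a hypothetical helper, not part of Connotea.
# $1 = the socket file MySQL really creates
# $2 = the path the DBI module is looking for
link_socket() {
    real=$1
    expected=$2
    # Create the directory that should hold the link if it is missing.
    mkdir -p "$(dirname "$expected")"
    # -f replaces an existing (possibly stale) link of the same name.
    ln -sf "$real" "$expected"
}

# Usage, from a root shell (for example after `sudo -s`):
#   link_socket /tmp/mysql.sock /opt/local/var/run/mysql5/mysqld.sock
```

Because the function is safe to run repeatedly, it can be rerun whenever the link goes missing.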

Apache2

This is where MacPorts comes in handy. To install apache2 all you need to do is

$ sudo port install apache2

This will install a number of dependencies on your system, and it will install apache2 into /opt/local/apache2. You can start apache2 and ensure that it starts at next startup with the following command:

$ sudo launchctl load -w /Library/LaunchDaemons/org.macports.apache2.plist

You can test that it works by pointing your browser at http://localhost/ and you should get an “It Works!” message!

If you want to start and stop apache2 while working on the configuration of the server, or to check your sanity, then you can use the following commands:

$ sudo /opt/local/apache2/bin/apachectl stop
$ sudo /opt/local/apache2/bin/apachectl start

memcached

Using the MacPorts version this is pretty easy.

$ sudo port install memcached

A few words about CPAN

There is a comprehensive description of life under CPAN on OS X; however, I decided to wing it. I dislike CPAN. It is a very demanding program that has a tendency to want a lot of attention. It is nice to know that you can install modules from the command line with the following syntax:

$ sudo cpan -i -f URI::QueryParam 

This means that if you can get a list of dependencies together then you can semi-automate the process of installing them all. I understand that the preferred method of managing a large selection of perl dependencies is to create a CPAN Bundle, but I didn’t manage to do this, mainly due to a lack of familiarity with CPANishness. It is also worth noting that if CPAN is not able to install a module you can always have a go at downloading and installing the module from source. This is often quite straightforward.

Perl

As mentioned earlier, MacPorts does not provide the correct type of perl for running Connotea. Connotea requires ‘PerlOptions +Parent’ in the Apache configuration file. This requires a multithreaded perl. MacPorts only provides a perl without threads. This means you have to build your own perl. You can check the version and build configuration of the perl you have installed with the following command:

$ perl -V

What you want to end up with is the following:

Platform:
    osname=darwin, osvers=9.6.0, archname=darwin-thread-multi-2level
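If you only care about the threading question, perl can print that single setting with `perl -V:usethreads`, which reports usethreads='define'; for a threaded build. The sketch below wraps that check in a function; the name threading_status is made up for this example.

```shell
# Sketch only: threading_status is a hypothetical helper name.
# It takes the output of `perl -V:usethreads` as its argument and
# prints "threaded" or "unthreaded".
threading_status() {
    case "$1" in
        *"usethreads='define'"*) echo "threaded" ;;
        *)                       echo "unthreaded" ;;
    esac
}

# Usage (checks whichever perl is first on your PATH):
#   threading_status "$(perl -V:usethreads)"
```

Run it once against the system perl and once against your own build to confirm that you are looking at the right one.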

Go ahead and grab a copy of Perl. You pretty much want to follow the instructions given in the README.macosx file, making sure that you build perl with multithreaded support. I followed a lot of instructions for passing arguments to the Configure script without much success. In the end I think that my mistake was passing the ‘-de’ flag, which tells the script to accept the default settings. Don’t do this; just go through the interrogation and at some point you will be asked if you want to build with multithreaded support. Say YES!! Hopefully you will end up with a perl installed in /usr/local/bin/perl. This will be independent of the system perl and you will no longer be susceptible to Apple fucking you over (albeit unintentionally). Now when you build anything on top of this perl it will have the same architectural support (I believe), so you can build mod_perl on top of this perl. One tip for speeding up the process: if you have a multi-core machine you can pass make the -j argument, which tells it how many jobs to run in parallel, so

$ make -j 2

can build up to about twice as fast as a plain make on a two-core machine.
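Rather than hard-coding the number, you can ask the machine how many cores it has. The sketch below tries `sysctl -n hw.ncpu`, which is the OS X way, then falls back to getconf and finally to 1. The function name cpu_count is made up for this example.

```shell
# Sketch only: cpu_count is a hypothetical helper name.
# Prints the number of CPU cores, or 1 if it cannot be determined.
cpu_count() {
    n=$(sysctl -n hw.ncpu 2>/dev/null) \
        || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) \
        || n=1
    # Guard against empty or non-numeric answers.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# Usage:
#   make -j "$(cpu_count)"
```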

mod_perl

Get a copy of mod_perl. Go into your mod_perl source directory and from there do the following:

$ perl Makefile.PL MP_APXS=/opt/local/apache2/bin/apxs
$ make && make test

After a lengthy build process you should be ready to install with:

$ sudo make install

In /opt/local/apache2/modules you should now find mod_perl.so. We can now test our mod_perl to see if everything is OK. Have a look at the fast guide to getting started with mod_perl and follow the instructions there.

To set up a test of mod_perl we need to edit apache’s httpd.conf file, which can be found at /opt/local/apache2/conf/httpd.conf

Add the mod_perl module by adding the following line to the conf file

LoadModule perl_module modules/mod_perl.so

Stop and restart your apache2 server and take a peek at the error log. You should get a line like the following:

$ tail /opt/local/apache2/logs/
[Thu Apr 23 14:59:22 2009] [notice] Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8j DAV/2 mod_perl/2.0.4 Perl/v5.8.9 configured -- resuming normal operations

yay!

perl dependencies

OK, Connotea requires a lot of dependencies, most of which can be installed through CPAN, and a few of which need to be installed manually. I’ve provided a file, 1.8.extended.deplist.txt, that contains what I believe to be the current set of modules that Connotea requires. If you pipe the list of names into check-modules.py then you can check whether these modules have been installed. Passing any argument to the script will produce verbose output. Passing no arguments will just list the modules that have not been installed.

$ more 1.8.extended.deplist.txt | ./check-modules.py verbose 

For the time being please ignore the following modules:

Apache2
Apache::Const
Apache::File
File::Touch
Net::OpenID::JanRain::Consumer

You can generate a shell script to install non-installed modules via CPAN with the following command:

$ more 1.8.extended.deplist.txt | ./check-modules.py | awk '{print "sudo cpan -i " $1}' > install-modules.sh

and then:

$ chmod u+x install-modules.sh
$ sudo ./install-modules.sh

This will take some time. The CPAN interface is slow and requires a lot of intervention. If you get stuck with CPAN for a given module, then download the module and install from source.
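The awk step above can be extended so that the modules on the ignore list never make it into the install script. This is a sketch under the assumption that check-modules.py prints one missing module per line with the module name as the first field; the function name to_cpan_commands is made up for this example.

```shell
# Sketch only: to_cpan_commands is a hypothetical helper name.
# Reads module names on stdin (first field of each line) and prints one
# "sudo cpan -i ..." line per module, skipping blank lines and the
# modules this guide says to ignore for the time being.
to_cpan_commands() {
    awk '
        BEGIN {
            split("Apache2 Apache::Const Apache::File File::Touch Net::OpenID::JanRain::Consumer", names, " ")
            for (i in names) skip[names[i]] = 1
        }
        NF && !($1 in skip) { print "sudo cpan -i " $1 }
    '
}

# Usage:
#   ./check-modules.py < 1.8.extended.deplist.txt | to_cpan_commands > install-modules.sh
```

Keeping the ignore list inside the function means the generated script needs no hand editing before you run it.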

I had to manually install Authen::Captcha, which is available from http://search.cpan.org/dist/Authen-Captcha/Captcha.pl

Connotea code

The most recent public version of the Connotea code is now on GitHub under the name connotea-public. The main working branch is still under darcs; however, we are in the process of switching over to git, as the time to upload local patches to the dev server is becoming prohibitive. We may move to mercurial at some point, but for the time being the public-facing snapshot will remain in git. Head on over and grab a copy. The main files of interest for continuing the setup will be in /connotea-public/sql/, /connotea-public/README, and /connotea-public/config

Setup MySQL databases and permissions

Connotea uses four databases: the main content database, the search database (which is a replica of the content database), a click tracking database and the wiki database. The content database is an InnoDB database, and the search database is a MyISAM database, because it needs FULLTEXT indexes (see the comment on DBI_SEARCH in the configuration file). The schema for the content database is provided by schema.sql in the sql directory. To generate the schema for the search database do the following:

$ perl mkschema_search  < schema.sql > schema_search.sql

Now set up the schemas of the content database, the search database and the click tracking database:

$ mysql -u root -p < schema.sql
$ mysql -u root -p < schema_search.sql
$ mysql -u root -p < clicks.sql

You need to set up the search database to be a slave of the content database. This is done in the MySQL conf file by adding the following directives:

  [mysqld]
  # local replication of bibliotech to bibliotech_search:
  server-id=1
  log-bin=mysql-bin
  binlog-do-db=bibliotech
  replicate-same-server-id=1
  replicate-rewrite-db=bibliotech->bibliotech_search
  replicate-do-db=bibliotech_search
  master-host=localhost
  master-user=search_repl
  master-password=pass

You may want to add the following directives too:

  # change stopwords in support of bibliotech freematch feature:
  #ft_stopword_file=/etc/mysql_stopwords.txt
  ft_min_word_len=2
  ft_max_word_len=255
  # allow packing of queries
  group_concat_max_len=8192

You now need to set up the appropriate permissions from within MySQL for the following users: ‘conwiki’, ‘search_repl’ and ‘connotea’:

mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, CREATE TEMPORARY TABLES, ALTER ON bibliotech_search.* TO 
    -> connotea@localhost identified by 'secret';

mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, CREATE TEMPORARY TABLES, ALTER ON bibliotech.* TO 
    -> connotea@localhost identified by 'secret';

mysql> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO
    ->      search_repl@'localhost.localdomain' IDENTIFIED BY 'secret';

We now need to set up the wiki database. This is done through the following commands:

mysql> CREATE DATABASE conwiki;
mysql> GRANT ALL ON conwiki.* TO conwiki@localhost IDENTIFIED BY 'secret';

To set up the schema we just let wiki-toolkit-setupdb have at it:

$ /usr/local/bin/wiki-toolkit-setupdb --type mysql \
                                       --name conwiki \
                                       --user conwiki \
                                       --pass secret \
                                       --host localhost

Remember to populate the “COMPONENT WIKI” block of your configuration file with the wiki database details.

Connotea will be informed of the database that it needs to connect to in the conf file. I would say that it goes without saying that the passwords all need to be congruent, but I keep messing up that part, so I’m not going to say that.

Setup Apache directives and paths to the code.

OK, now we have mysql sitting there waiting to act as a data store. We have the connotea code. We just need to connect these together and to the outside world through the magic of apache and mod_perl.

You need to add directives like the following to your apache conf file. Remember, that file should be in

/opt/local/apache2/conf/httpd.conf

# change the document root, as we are going to host connotea from 
# a more traditional location
#DocumentRoot "/opt/local/apache2/htdocs"
DocumentRoot "/var/www/html"

# ensure that we can follow symlinks:
<Directory "/var/www/html">
    #
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.2/mod/core.html#options
    # for more information.
    #
    Options Indexes FollowSymLinks

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   Options FileInfo AuthConfig Limit
    #
    AllowOverride None

    #
    # Controls who can get stuff from this server.
    #
    Order allow,deny
    Allow from all

</Directory>

# add directives for connotea
<VirtualHost 127.0.0.1>
        ServerName local.connotea.com
        ServerAlias *local.connotea.com
        ServerAdmin ian@mulvany.net
        DocumentRoot /var/www/html/connotea
        PerlOptions +Parent
        PerlSwitches -I/var/www/perl/connotea
        PerlModule Bibliotech::Apache
        PerlModule Bibliotech::AuthCookie
        <Location />
          SetHandler perl-script
          PerlHandler Bibliotech::Apache
          PerlAuthenHandler Bibliotech::AuthCookie::authen_handler
          AuthName Bibliotech
          AuthType basic
          require valid-user
          #ErrorDocument 503 /paused.html
          #ErrorDocument 503 /readonly.html
          ErrorDocument 503 /unavailable.html
        </Location>
</VirtualHost>

In your /etc/hosts file add the following line:

127.0.0.1 local.connotea.com

This means you can point your browser at local.connotea.com and see the local installation. In /var/www create the directories perl, html and site. In /var/www/site create the directory connotea and place the source code in there. Then create the following symlinks:

/var/www/perl/connotea -> ../site/connotea

and

/var/www/html/connotea -> ../perl/connotea/site/npg

Make sure that the paths to the code are readable by apache. That means every directory along the full path has to be readable and traversable by apache, not just the final one.
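The directory layout and the two symlinks described above can be sketched as one shell function. The name make_layout is made up for this example; the root is passed as an argument so that you can try it in a scratch directory before pointing it at /var/www.

```shell
# Sketch only: make_layout is a hypothetical helper name.
# $1 = the root under which perl, html and site live (for example /var/www)
make_layout() {
    root=$1
    # The three top-level directories, plus the home of the source code.
    mkdir -p "$root/perl" "$root/html" "$root/site/connotea"
    # perl/connotea -> ../site/connotea
    ln -sfn ../site/connotea "$root/perl/connotea"
    # html/connotea -> ../perl/connotea/site/npg
    # (this link only resolves once the source code, including site/npg,
    # has been placed in site/connotea)
    ln -sfn ../perl/connotea/site/npg "$root/html/connotea"
}

# Usage, from a root shell:
#   make_layout /var/www
```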

Set up the connotea conf file.

Copy /connotea-code/conf to /etc/bibliotech.conf

The most important declarations to set up are the following:

DOCROOT = '/var/www/perl/connotea_code/site/default'
LOCATION = 'http://www.mydomain.com/'
DBI_CONNECT = 'dbi:mysql:bibliotech'
DBI_USERNAME = 'user'
DBI_PASSWORD = 'secret'
# just the database name of the replicated MyISAM FULLTEXT-enabled database
DBI_SEARCH = 'bibliotech_search'
# for debugging purposes set EXPLAIN_HTTP_CODES to true
# set it to false when you have everything running.
# It causes application errors to be printed to the requesting page.
EXPLAIN_HTTP_CODES = true 

CLICKS {
  # options regarding click tracking
  # database connection details
  DBI_CONNECT = 'dbi:mysql:clicks'
  DBI_USERNAME = 'user'
  DBI_PASSWORD = 'secret'
}

COMPONENT WIKI {
  #DBI_CONNECT = 'dbi:mysql:conwiki'
  #DBI_USERNAME = 'conwiki'
  #DBI_PASSWORD = 'secret'
  #ADMIN_USERS = [ 'admin' ]
  #LOCK_TIME = '10 MINUTE'
  #ALLOW_EDIT = true
  #HOME_NODE = 'System:Home'
  # page size limit is in characters
  #MAX_PAGE_SIZE = 40000
  # maximum external hyperlink count, cuts down on spam
  #MAX_EXT_LINKS = 75
  # scan: 1 means check text against ANTISPAM > TAG_REALLY_BAD_PHRASE_LIST
  # scan: 2 means check text against ANTISPAM > WIKI_BAD_PHRASE_LIST+TAG_REALLY_BAD_PHRASE_LIST
  #SCAN = 1
  # to admit the spam rule that rejects wiki text, set this to true:
  #SAY_SPAM_RULE = false
}

# sendmail needs to be set to the OS X path.
SENDMAIL                   = '/usr/sbin/sendmail'

Final adjustments

Ensure that antispam_score.csv is writable by connotea.

Turn it on.

You should be able to run the application now.

Potential Gotchas

If you are getting an ‘Undefined subroutine &LWP::Debug::trace’ error then you need to downgrade your installed version of LWP. There is a good description of the problem with a fix. You can use check-module-version.pl to see what version of a given module you have installed.

In a previous installation I had problems getting the perl GD module installed on my Mac. If you are encountering problems then there is a good set of instructions for installing GD on OS X.