Hi everybody

at 03 December 2007

Most of you have probably seen me around the halls by now. My name is Jared O'Connell and I started with the Statistics & Decision Analysis group two weeks ago.

My background is Applied Statistics and Computer Science, which were my majors at the University of Western Australia. After finishing my bachelor's degree, I worked for three years at a government science body, the CSIRO, performing satellite imagery analysis. This work involved the analysis of large amounts of data using classification, robust regression, HMMs and parallel computing to make it all run fast enough!

I will now be working within the ILSORM project, looking at the relationship between progesterone, activity levels and reproductive status in cows. I look forward to working with you all.

Course in Generalized Linear Models with Biological Applications

at 30 November 2007

In spring 2008 (start: 25. February) our research group offers a course in generalized linear models with biological applications.

The course introduces the modeling (regression analysis) of non-normal observations such as counts and proportions. Methods to handle correlated observations will also be introduced. The lectures are accompanied by computing exercises using the statistical package ‘R’. Due to its versatility and graphical capacities it has something to offer even to the experienced SAS user.

The course is open to everyone with basic statistical knowledge. The course is free of charge for PhD students. It is also free for students and employees affiliated to Aarhus University. The course is approved as a PhD course (10 ECTS points) at Aarhus University.

For additional information please check the course homepage

Best regards

Ulrich Halekoh

New versions of Windows Live Writer and JabRef

at 20 November 2007

Lars has previously blogged about Windows Live Writer and JabRef. Both programs have now been updated. Visit Writer Zone: Windows Live Writer: Out of Beta or the JabRef page at SourceForge to download the new versions.

How to reconfigure a keyboard

at 15 November 2007

Recently we held the introductory R course. In R you need to use the tilde (~) sign often. Every time you specify a statistical formula, you write something like:



Y ~ Treatment + X



One of the participants had a keyboard on her (Italian) laptop that had no '~' sign. As another participant remembered, that is no problem: you simply type the ASCII code of '~' (126) on the numeric keypad while holding down the 'Alt' key. And of course, on a laptop without a numeric keypad you simply find the blue 'fn' key and the blue 'num lock' key and press them simultaneously, then type the ASCII code with the Alt key pressed, and then press the blue 'fn' key and the blue 'num lock' key again. (If you forget the final part, 5t 5s act4a33y n4 *r6b3e0, s60e 6f the 2eys are s50*3y re*3aced w5th n40bers, oops - I meant to say, it is actually no problem, some of the keys are simply replaced with numbers).
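The two effects described above can be illustrated with a small sketch. It is written in Python purely for illustration (the post itself is about Windows keyboards and R). The table `NUMPAD_OVERLAY` and the function `with_num_lock` are hypothetical names, and the letter-to-digit mapping is an assumption inferred from the garbled sentence above, not taken from any particular laptop.

```python
# ASCII code 126 is the tilde, which is why Alt+126 on the numeric keypad gives '~'.
TILDE_CODE = 126
assert chr(TILDE_CODE) == "~"
assert ord("~") == TILDE_CODE

# Assumed embedded numeric keypad of a laptop: with 'num lock' active,
# these letter keys produce keypad characters instead of letters.
NUMPAD_OVERLAY = {
    "u": "4", "i": "5", "o": "6",
    "j": "1", "k": "2", "l": "3",
    "m": "0", "p": "*",
}


def with_num_lock(text: str) -> str:
    """Return what appears on screen if 'num lock' is left switched on."""
    return "".join(NUMPAD_OVERLAY.get(ch, ch) for ch in text)


print(with_num_lock("some of the keys are simply replaced with numbers"))
```

Switching 'num lock' off again restores the normal letters, which is the final step that is easy to forget.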


The R-Help list had some discussion concerning this problem in 2004, where the solution above was suggested.

However, the satisfactory solution is to remap the keyboard. Microsoft has a utility program for doing this. You may download it from the Microsoft Keyboard Layout Creator. For Windows XP you probably need the old version (1.3), while Vista users can use version 1.4.


The instructions in the help menu are easy to follow. I'll just show a couple of screen dumps. First of all I load an existing keyboard via the File | Load existing keyboard option. I choose the Danish keyboard. The program then shows an image of the existing keyboard layout.





By selecting the different shift states, you may see the key mapping in the corresponding states. As an example, the AltGr shift state below.





In fact I have already changed the keyboard mapping. As you can see, the keyboard now shows a '~' sign corresponding to the 'N' key. In order to obtain this mapping I simply clicked on the key in the screen image. A popup window allowed me to type the '~' sign and the redefinition was made. (If your keyboard does not have the '~' sign you need to use the trick with the numeric keypad and Alt 126 as described above).

Finally, you choose Project | Build DLL and Setup Package, as described in the Help file of the program. This produces an msi file that can be executed, and your new keyboard is installed.



So now I have a keyboard that produces a tilde sign when I press AltGr+N. In contrast to the existing one (next to the Enter key on the right-hand side) this one is not a dead key. I do not need to wait for another keypress to see the '~' sign.


Another option is to use e.g. AutoHotkey.

How to import a SAS data set into R

at 07 November 2007

The import of data into R is still a bit complicated if the original data are stored in the SAS-data format. Though solutions have existed for a long time, see e.g. the R-documentation concerning Importing from other statistical systems, it is complicated to use these facilities. They are based on functions which need to call the SAS-program from R. The settings for doing this do not always correspond to the settings on the computers we use.

I have written a short note/web-page with focus on the export problem from the SAS point of view. The note compares three different intermediate formats: csv, xls and xport format.

In normal use, the route via the xls format seems best. Data are exported via SAS code similar to:

 
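/* Library holding the SAS data set to be exported */
libname sasdata "C:/SASImport";
/* Excel workbook used as the intermediate file */
libname out Excel "C:/SASImport/file798b12e1.xls";
/* Copy the data set into the workbook */
data out.rimporttest;
  set sasdata.rimporttest;
run;
/* Release the workbook so that other programs can open it */
libname out clear;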

and imported into R via the read.xls command in the xlsReadWrite package.


Further details are in the note, which also includes R functions. The note is also available as a pdf file.

AMORPH – Agricultural practise – mobility, availability and retention of phosphorus in soils

at 08 August 2007

The project will focus on the potentially mobile dissolved and colloidal P covering a broad range of agricultural soil types and try to investigate how aspects of agricultural practices influence the potential P mobility.

Our group will mainly be involved in analysing the plant available P (Olsen P) in order to describe how the plant available P depends on explanatory variables.

The project is led by Gitte Rubæk (Department of Agroecology and environment)

NLES4 – Nitrogen Leaching from agricultural fields



The purpose of the project is to improve the existing NLES3 model for prediction of nitrogen leaching under different growing conditions

Our group will be responsible for the statistical analyses of leaching recorded in trials and on private farms.

The project is led by Uffe Jørgensen (Department of Agroecology and environment)

IMPACT – Impacts and adoption to climate change in cropping systems



The purpose of the project is to analyse likely impacts of climate change on arable cropping systems in Denmark with particular reference to crop quality, N cycling, nitrate leaching and changes in occurrence of pests and diseases.

Our group will mainly be involved in the analyses of existing long-term trials in order to evaluate the effect of climate on crop yield.

The project is led by Jørgen E. Olesen (Department of Agroecology and environment)

Wink screensnapshot software

at 18 June 2007

Wink (http://www.debugmode.com/wink/) is free software for capturing screenshots.

Main features:

  1. Three modes: 1) take single screenshots, 2) take a shot at each mouse or key action, 3) record in real time. You can switch between these modes within a session.
  2. You can choose to record a specific window only (for example slides) or the whole screen.
  3. You can edit the final screenshots: add voice, add text elements.
  4. You can interleave different sessions.
  5. The final format is an Adobe Flash swf file, which is started via an html file. Any user can download Flash from Adobe.

Two commercial alternatives are Camtasia (TechSmith, http://www.techsmith.com/camtasia.asp, about 300 dollars) and Captivate (Adobe, http://www.adobe.com/products/captivate/, about 600 dollars).

Captivate seems to be more like Wink, as it allows taking individual snapshots. It has the additional feature of supporting interactive learning facilities such as quizzes.

For a comparison of both, see:

http://www.streamingmedia.com/article.asp?id=9393

Ulrich

ICA (independent component analysis)

at 02 May 2007

Following the Epital presentation in April, I surfed the web a bit about ICA (independent component analysis). It has - at least on the surface - a strong resemblance to factor analysis. Some web links are:

http://en.wikipedia.org/wiki/Independent_Component_Analysis
http://www.sccn.ucsd.edu/~arno/indexica.html ("ICA for dummies")
http://www.cnl.salk.edu/~tewon/ica_cnl.html
http://www.cis.hut.fi/projects/ica/ (where additional links can be found)
A book on the topic is mentioned at http://www.cis.hut.fi/projects/ica/book

There are two R packages related to ICA: fastICA and mlica, see also http://cran.r-project.org/src/contrib/Views/Multivariate.html
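The resemblance to factor analysis is easiest to see with the two models side by side. These are the standard textbook formulations, not taken from the Epital presentation; the symbols (x for the observed vector, A for the mixing matrix, and so on) are chosen here for illustration.

```latex
% ICA: the observed vector x is a linear mixture of independent, non-Gaussian sources s
x = A s, \qquad s_1, \dots, s_k \ \text{mutually independent, at most one Gaussian}

% Factor analysis: Gaussian factors f plus independent noise
x = \Lambda f + \varepsilon, \qquad f \sim N(0, I), \quad
\varepsilon \sim N(0, \Psi), \ \Psi \ \text{diagonal}
```

In both cases the observations are a linear combination of a few latent variables. ICA replaces the Gaussian assumption by independence and non-Gaussianity, which identifies the mixing matrix up to scaling and permutation, whereas factor loadings are only identified up to rotation.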

R for SAS and SPSS users



I came across this document - RforSAS&SPSSusers.pdf - which gives a nice R-introduction to people familiar with other statistical packages. It might be of use in connection with various courses.

Lameness of dairy cattle - early identification and consequences for behaviour and production

at 19 April 2007

The purpose is to develop and optimize a novel tool for early identification of dairy cattle with lameness in order to increase animal welfare, product quality and producer profit. Moreover, the consequences of lameness for animal welfare and production will be illuminated.

Development of the Virtual Slaughter



The purpose of the project is to develop 3-D models of the body of slaughter pigs based on CT images for

 

  1. Product optimization: characterization of slaughter quality is related to market demands.
  2. Product development: new dissections of the body performed via the virtual model can be analyzed with respect to slaughter quality and economical consequences.
  3. Predictor finding: Identification of measurable characteristics of the slaughter body related to slaughter quality that can be used for online measurements and consultation for farmers under production.

Our group is responsible for the identification of measurable predictors (part 3). An experiment is conducted by Pia Nissen (Department of Food Science, DJF) to gain information about the development of the pig body and to test the robustness of predictors developed on already scanned slaughter pigs.

The project is led by Eli Vibeke Olsen (Danish Meat Association) in collaboration with DTU and DJF.

 

Project responsible (part 3): Ulrich Halekoh

ulrich.halekoh@agrsci.dk

Project period: 2006-2008 (months 8)

AUREGAB - Automated registration of animal behaviour

at 10 April 2007

The AUREGAB project aims at developing new tools for monitoring and for early identification of dairy cows with reduced appetite, with lameness problems and cows in heat. In the project we use Bluetooth technology for determining the positions and lying behaviour of the animals.

Project period: 2006-2008

Project participants from our group include Frede Aakman Tøgersen and Søren Højsgaard (project leader)

ILSORM - Integrated risk management



The ILSORM project is about integrated risk management in dairy production using information technology and other modern technologies. Some keywords for our activities in the project are: sensor measurements, dynamic models, decision making, optimal decisions.

Project period: 2007-2009.

Project participants from our group include Lars Relund Nielsen, Asger Roer Pedersen and Søren Højsgaard.

Statistical Reviewers Improve Reporting in Biomedical Articles: A Randomized Trial

at 28 March 2007

It has been a while since there was a post on the blog, so why not this link to an article in the journal PLoS One:

Statistical Reviewers Improve Reporting in Biomedical Articles: A Randomized Trial

And I am well aware that not everyone will regard the article's conclusions as news.

Biosens II

at 13 March 2007

The Biosens II project deals with improved monitoring and management of dairy production and milk quality based on on-farm biosensors.

In the work package in which we are primarily involved, we focus on developing self-adjusting dynamic statistical models to predict oestrus and mastitis from inline measurements, and Markov decision models to find optimal decision strategies.

Project period: 2007-2011

Project participants from our group include Lars Relund Nielsen, Erik Jørgensen and Søren Højsgaard

Data and program examples from SAS for Mixed Models

at 09 March 2007

In our current mixed model course we use the book SAS for Mixed Models, Second Edition by Ramon C. Littell, George A. Milliken, Walter W. Stroup, Russell D. Wolfinger, and Oliver Schabenberger



This is the first time we use the 2nd edition of the book. The changes in the second edition include extensive use of the output delivery system (ODS) and examples using e.g. PROC NLMIXED and PROC GLIMMIX rather than the glimmix macro used in the 1st edition.


Participants in our previous courses might be interested in updating the file with program and data examples. It can be downloaded from the companion page above. The link to the file is a little difficult to find on the page, so you may just follow this link to program examples and data

Web access to Journal of the Royal Statistical Society: Series AB..

at 05 March 2007

I was recently hunting for an article in Series B of the above-mentioned journal. A web link led me to IngentaConnect, but there we only have access to abstracts. Then I tried the library's journal list on the intranet. That led me to SwetsWise, but there only the table of contents was accessible.

The next attempt led to the journal's publisher, Blackwell, Journal of the Royal Statistical Society. There I had success!

(Almost: the April issue was available at the other sites but not yet on Blackwell's pages, though it will surely turn up.)

I hope others can save a little time by going directly to Blackwell.

Use feeds for tracking blog updates

at 06 February 2007

Ulrich asked about mails from the more specialised R interest groups. That gives me the opportunity to make use of an old draft about RSS feeds. One (of many) overviews of R mailing lists can be found at http://dir.gmane.org/index.php?prefix=gmane.comp.lang.r. The advantage of that overview is that it is possible to receive RSS feeds from the lists, so I will start with the earlier draft about feeds and then return to the R mails.

Currently the members of SBT automatically receive an e-mail when the SBT blog is updated. A better alternative is to use the feeds (RSS, Atom, etc.) generated from the blog. (If you do not know what a feed is, it should be easy to find some information on the web about feed handling, e.g. the Wikipedia entry.) Both Firefox and Internet Explorer 7 can now handle such feeds. The web feed logo is shown in the address bar; just click on it and it should be possible to add the feed to your feed reader of choice.

I can strongly recommend Netvibes. Just click on the link and you visit the homepage with some suggested content added. Choose the Log Ind (login) option and you are able to register yourself as a user. When you are logged in you can simply click on the Netvibes button in the right column. Then the feed is added to your Netvibes page.

As another example just push this button Add to Netvibes and a feed with the latest contents of the Journal of the Royal Statistical Society  - Series B is added.

And now back to the R mails. The link http://dir.gmane.org/index.php?prefix=gmane.comp.lang.r contains links to the various R lists, for example the lme4-development list (http://dir.gmane.org/gmane.comp.lang.r.lme4.devel). By visiting this lme4 page you can see several ways of viewing mails from the list. The bottom four are all RSS feeds with different levels of detail. By subscribing to these in Netvibes you get an automatically updated list of topics from the mailing list and can get more detailed information if a topic is interesting. And, perhaps more usefully, the uninteresting topics disappear automatically after a while.
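What a feed reader such as Netvibes does with a feed can be sketched roughly as follows. This is a minimal illustration in Python, assuming an RSS 2.0 feed; the sample feed text, its example.org URLs and the function name `list_items` are made up for the example and are not taken from gmane or from the blog.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed (assumption).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example mailing list</title>
    <item>
      <title>First topic</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second topic</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml: str) -> list:
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in list_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

A real reader would fetch the feed text from its URL at regular intervals and show only the items it has not seen before.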

Dogme meetings

at 04 February 2007

The calendar for the Dogme meetings has been updated, since there were quite a few changes for the coming month.

To the speakers: check the calendar to see when you are due to give a talk, and remember to enter a title for your talk. You can also send a mail to all participants in our group.

Entering a post via Live Writer

at 30 January 2007

One way to enter a post is to use Live Writer, a small program in which you can write your posts 'offline' and then upload them to a blog.

Live Writer is a good program if, for example, you have several blogs to administer and want to work in a 'Word-like' interface.

To use Live Writer, do the following:

  1. Install Live Writer and open the program.
  2. Add your blog by entering the URL of the blog.
  3. You will also need your user name and your password.
  4. After that you should be able to enter a post and press the 'publish' button when you want to publish it.

Bug in SAS

at 17 January 2007

Both SAS version 8.2 and 9.1.3 produce incorrect calculations when there is more than one random statement in which the covariance structure has type=sp(...)

Updates are available for both versions of SAS:


Hotfix for V9:
http://ftp.sas.com/techsup/download/hotfix/e9_win_sbcs.html#e9st05
Hotfix for V8:
http://ftp.sas.com/techsup/download/hotfix/82_win_sbcs.html#82ST19

To install the hotfix for version 9, Service Pack 4 must be installed first
(http://ftp.sas.com/techsup/download/hotfix/e9_win_sbcs.html#e9h004)

I have asked for this to be fixed on Linux, but that has not yet been done.

SASweave - writing LaTeX docs and SAS (and R) code together

at 12 January 2007

A looooong time ago I wrote a small package allowing combining LaTeX, SAS and R into one document. I never took the work any further than the first version...

Russ Lenth from Iowa found it on the Web and he has now created a fully functional program called "SASweave" (see the link) - mainly for working with SAS and LaTeX together, but it is also possible to use it additionally with R code. We are currently writing a little paper about the program, so any input/comments/experiences from you will be appreciated. Currently I am using it when preparing the lecture slides etc. for our mixed model course in Spring 2007. It works fairly well - if I may say so...

Course in generalized linear models

at 09 January 2007

In spring 2007 (start: 26. February) the Unit of Statistics and Decision Analysis offers a course in generalized linear models with biological applications.

The course introduces the modeling (regression analysis) of non-normal observations such as counts and proportions. Methods to handle correlated observations will also be introduced. The lectures are accompanied by computing exercises using the statistical package ‘R’. Due to its versatility and graphical capacities it has something to offer even to the experienced SAS user.

The course is open to everyone with basic knowledge in statistics. The course is free of charge for PhD students, and for other students and employees affiliated to the faculty of agricultural sciences. For other participants the course fee is 12000 DKK. The course has been approved as a PhD course (9 ECTS points) at the former KVL.

For additional information please check the course homepage
http://genetics.agrsci.dk/statistics/courses/phd07/


Best regards

Ulrich Halekoh

Using the SQLite database in connection with SAS and R

at 03 January 2007

I have made this small document which explains how to set up an SQLite database and use it from both SAS and R. Working with SQLite is very easy. Comments will be appreciated; I guess the document could be of use in connection with our teaching.
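As a rough idea of how little set-up SQLite needs, here is a minimal sketch. It uses Python's built-in sqlite3 module purely for illustration (the document mentioned above covers SAS and R, not Python); the function name `build_demo_db`, the table name `trial`, its columns and the values are all invented for the example.

```python
import sqlite3


def build_demo_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a small example table; ':memory:' keeps the database in RAM."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE trial (animal_id INTEGER, treatment TEXT, yield_kg REAL)"
    )
    rows = [(1, "A", 25.5), (2, "B", 27.0), (3, "A", 24.5)]
    con.executemany("INSERT INTO trial VALUES (?, ?, ?)", rows)
    con.commit()
    return con


con = build_demo_db()
# Ordinary SQL works directly on the database: mean yield per treatment.
means = con.execute(
    "SELECT treatment, AVG(yield_kg) FROM trial "
    "GROUP BY treatment ORDER BY treatment"
).fetchall()
print(means)
```

Passing a file name instead of ":memory:" stores the whole database in that single file, which is what makes it convenient to share between different programs.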