Unit testing in R

2012-01-09

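Is everyone writing unit tests properly?
For one-off analysis code with little reuse, testing probably feels irrelevant, but if you are developing a package or writing functions you reuse often, you should write tests.
For unit testing in R, RUnit is probably the usual choice.
So here is a practical guide to using RUnit!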

Trying it out

RUnit provides the following functions for checking values.

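Function	Description
checkEquals	Checks whether two objects have the same structure and (nearly) the same values
checkEqualsNumeric	Checks whether two objects become (nearly) the same numeric vector when converted to vectors
checkIdentical	Checks whether two objects are exactly identical
checkTrue	Checks whether a value is TRUE
checkException	Checks whether an expression raises an exception (error)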

There are several, but knowing just checkEquals and checkException should be enough.
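To make the difference between "(nearly) the same" and "exactly identical" concrete, here is a small base R sketch. According to the RUnit documentation, checkEquals builds on all.equal() and checkIdentical on identical(), so the sketch calls those two directly rather than RUnit itself. The variable names are only for illustration.

```r
# Floating-point arithmetic: 0.1 + 0.2 is not exactly 0.3.
x <- 0.1 + 0.2

nearlyEqual <- isTRUE(all.equal(x, 0.3))  # tolerance-based comparison
exactlySame <- identical(x, 0.3)          # bit-for-bit comparison

# Integer 1 and double 1 hold the same value but have different types.
sameValue <- isTRUE(all.equal(1L, 1))
sameObject <- identical(1L, 1)

print(c(nearlyEqual = nearlyEqual, exactlySame = exactlySame,
        sameValue = sameValue, sameObject = sameObject))
```

In other words, checkEquals is the forgiving choice for numeric results, while checkIdentical is for cases where type and representation also have to match.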

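This time, we will write a function (countScreenNames) that takes a text and returns the counts of the Twitter IDs (screen names) it contains.
I looked into which patterns Twitter Web treats as a Twitter ID, and it seems a test like the following could be written.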

test.count_screen_names.R

test.countScreenNames <- function() {
    test.cases <- list(list(# Not treated as an ID when a letter, digit, or underscore precedes the @
                            text     = "a_bicky@example.com",
                            expected = table(character(0))),
                       list(text     = "ほげ@a_bicky",
                            expected = table("a_bicky")),
                       list(# Not treated as an ID when an @ follows the Twitter ID
                            text     = "@a_bicky@a_bicky",
                            expected = table(character(0))),
                       list(text     = "@a_bicky @a_bicky",
                            expected = table(c("a_bicky", "a_bicky"))),
                       list(# Not treated as an ID when the @ is followed by anything other than a letter, digit, or underscore
                            text     = "@ほげ",
                            expected = table(character(0))))

    for (test.case in test.cases) {
        received <- countScreenNames(test.case$text)
        expected <- test.case$expected
        checkEquals(received, expected, msg = sprintf("\ntext: %s\n", test.case$text))
    }
}

Note: these cases are clearly not sufficient, but this is only a sample, so please don't mind.

When you pass a file containing test functions to runTestFile, by default it runs the functions in that file whose names start with test.

> library(RUnit)
> runTestFile("test.count_screen_names.R")


Executing test function test.countScreenNames  ... Timing stopped at: 0.003 0 0.003 
Error : could not find function "countScreenNames"
 done successfully.

Number of test functions: 1 
Number of errors: 1 
Number of failures: 0 
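The protocol output in this post reports the function-name pattern as ^test.+, so "starts with test" can be checked with plain grepl(). The sketch below uses made-up function names to show which ones such a pattern would pick up. It does not call RUnit.

```r
# Hypothetical function names, only to illustrate the pattern.
candidates <- c("test.countScreenNames", "testHelper", "helper.test", "test")

# The pattern needs "test" at the start plus at least one more character.
isTest <- grepl("^test.+", candidates)

print(data.frame(name = candidates, selected = isTest))
```

Note that a function named exactly test would not match, because .+ requires at least one character after the prefix.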

It's the standard explanation, but since the function isn't defined yet, the test doesn't pass.
So let's define the function. Written roughly, it might look like this.

count_screen_names.R

countScreenNames <- function(text) {
    regex <- "(?x)
              (?:
                  ( [@＠] ) ( \\w+ )
                  | [\\s\\S]
              )"
    screenNames <- gsub(regex, "\\1\\2", text, perl = TRUE)
    table(strsplit(substring(screenNames, 2), "[@＠]")[[1]])
}

Note: regmatches, introduced in R 2.14, would probably make this simpler to write.
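As a rough idea of what a regmatches version could look like, here is a minimal sketch. The name countScreenNamesSketch is hypothetical. It handles only the ASCII @ (not the full-width ＠), and like the first draft above it does nothing about the edge cases in the test.

```r
# Hypothetical alternative to the gsub-based draft, using gregexpr() + regmatches().
countScreenNamesSketch <- function(text) {
    # Find every "@" followed by letters, digits, or underscores.
    hits <- regmatches(text, gregexpr("@[A-Za-z0-9_]+", text))[[1]]
    # Drop the leading "@" and tabulate the names.
    table(substring(hits, 2))
}

counts <- countScreenNamesSketch("@a_bicky @a_bicky hello")
print(counts)
```

When nothing matches, regmatches returns character(0), so the function returns an empty table without any special handling.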

Now let's run the test again.

> source("count_screen_names.R")
> result <- runTestFile("test.count_screen_names.R")


Executing test function test.countScreenNames  ... Timing stopped at: 0.003 0 0.003 
Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\n",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
text: a_bicky@example.com

 done successfully.

Number of test functions: 1 
Number of errors: 0 
Number of failures: 1 

Oops, it looks like it's failing...

Going one step further

runTestFile returns an object of class RUnitTestData, so you can inspect the details of the log with the printTextProtocol function.
We captured the return value in a variable called result earlier, so let's print it.

> printTextProtocol(result)
RUNIT TEST PROTOCOL -- Mon Jan  9 09:48:57 2012 
*********************************************** 
Number of test functions: 1 
Number of errors: 0 
Number of failures: 1 

 
1 Test Suite : 
test - 1 test function, 0 errors, 1 failure
FAILURE in test.countScreenNames: Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\n",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
text: a_bicky@example.com




Details 
*************************** 
Test Suite: test 
Test function regexp: ^test.+ 
Test file regexp: ^test.count_screen_names.R$ 
Involved directory: 
. 
--------------------------- 
Test file: ./test.count_screen_names.R 
test.countScreenNames: FAILURE !! (check number 1)
Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\n",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
text: a_bicky@example.com

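From this result, we can see that the failing test is the function test.countScreenNames (obvious here, since there is only one test function...) and that it failed at the first check in that function (check number 1).
Because we also pass the test case's text in the msg of checkEquals, we can tell that it trips when the text a_bicky@example.com is passed.
Note: msg is shown only when a test fails.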

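RUnit is wonderful!!

...or is it?
As for test frameworks, I have used Perl's Test::More and PHP's PHPUnit, and both display the received value and the expected value.
With PHPUnit, I seem to recall that once one test case fails, the remaining test cases are not run, whereas Test::More runs all test cases unless it dies with an error.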

Personally, I have two requests:

Show the received value and the expected value
Run every test case even when one fails along the way

Passing received and expected into the msg of checkEquals as below looks like it would solve the first problem, but it has several issues.

checkEquals(received, expected,
            msg = sprintf("\ntext: %s\nreceived\n%s\nexpected\n%s",
                          test.case$text, received, expected))

Let's run it.

> result <- runTestFile("test.count_screen_names.R")


Executing test function test.countScreenNames  ... Timing stopped at: 0.004 0 0.004 
Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\nreceived\n%s\nexpected\n%s",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
 done successfully.

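Nothing is displayed...
This is because expected is zero-length, and sprintf("expected: %s", character(0)) evaluates to character(0).
Also, it would be manageable if both received and expected were character vectors, but when they are tables, as here, displaying them properly takes some effort.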
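The zero-length behaviour of sprintf is easy to reproduce in base R. The sketch below also shows one possible way around it: capturing what print() would show and pasting it into a single string. The helper name describeValue is hypothetical, and this is not how showDiff in the Runit command is implemented.

```r
# sprintf() returns a zero-length result when any argument is zero-length.
emptyMsg <- sprintf("expected: %s", character(0))
print(length(emptyMsg))

# Hypothetical helper: turn any object into one printable string.
describeValue <- function(x) {
    paste(capture.output(print(x)), collapse = "\n")
}

msg <- sprintf("received:\n%s\nexpected:\n%s",
               describeValue(table("a_bicky")),
               describeValue(table(character(0))))
cat(msg, "\n")
```

Because describeValue always returns exactly one string, the message survives even when the value being described is empty.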

The second problem looks solvable by wrapping the check in try.

try(checkEquals(received, expected, msg = sprintf("\ntext: %s\n", test.case$text)))

Let's run it.

> result <- runTestFile("test.count_screen_names.R")


Executing test function test.countScreenNames  ... Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\n",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
text: a_bicky@example.com

Error in checkEquals(received, expected, msg = sprintf("\ntext: %s\n",  : 
  names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ
text: @a_bicky@a_bicky

 done successfully.

> printTextProtocol(result)
RUNIT TEST PROTOCOL -- Mon Jan  9 10:16:01 2012 
*********************************************** 
Number of test functions: 1 
Number of errors: 0 
Number of failures: 0 

 
1 Test Suite : 
test - 1 test function, 0 errors, 0 failures



Details 
*************************** 
Test Suite: test 
Test function regexp: ^test.+ 
Test file regexp: ^test.count_screen_names.R$ 
Involved directory: 
. 
--------------------------- 
Test file: ./test.count_screen_names.R 
test.countScreenNames: (5 checks) ... OK (0.01 seconds)

All five test cases were run, but... the test passed...
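A plausible reading of this result is that try() catches the error raised by the failing check, so nothing propagates up to the test runner and it has no failure to record. The base R sketch below shows the same effect without RUnit, using stop() as a stand-in for a failing check.

```r
# stop() stands in for a check that fails.
outcome <- try(stop("check failed"), silent = TRUE)

# try() hands back a "try-error" object instead of letting the error propagate.
swallowed <- inherits(outcome, "try-error")

# Execution simply continues, as if nothing had gone wrong.
reachedEnd <- TRUE
print(c(swallowed = swallowed, reachedEnd = reachedEnd))
```

So try() does keep the loop going, but something still has to remember the failures and report them afterwards.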

So I made a Runit command!!

I made it!! Inspired by PHPUnit.
command line tool for RUnit, which is a unit test framework for R ― Gist

Installation

For example, install it as follows.

$ wget https://raw.github.com/gist/1580439 -O Runit
$ chmod +x Runit
$ sudo mv Runit /usr/local/bin

Creating a skeleton

Pass the --make-skeleton option and specify the file that contains the functions you want to test.

$ Runit --make-skeleton count_screen_names.R
Making skeleton...
Wrote skeleton for 'count_screen_names.R' to '/Users/arabiki/work/r/test.count_screen_names.R'.

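And so test.count_screen_names.R is created in the directory where the specified file lives.
Note: if a file with the same name already exists, you are asked whether to overwrite it.

The generated script looks like this.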

#-----------------------------------------------------------
# Test script for count_screen_names.R
# Generated by Runit command on 2012-01-09 at 10:35:59
#-----------------------------------------------------------
source("/Users/arabiki/work/r/count_screen_names.R")

# This function is executed directly before each test function execution
.setUp <- function() {
}

# This function is executed directly after each test function execution
.tearDown <- function() {
}

test.countScreenNames <- function() {
    # Remove the following line when you implement this test.
    DEACTIVATED("This test has not been implemented yet.")

    # next code is executed if this test script is executed using Runit command
    if (exists("checkFailure") && is.function(checkFailure)) {
        checkFailure()
    }
}

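There is a function called checkFailure; it is defined inside the Runit command.
The Runit command overrides functions such as checkEquals so that a failed test does not stop execution.
On its own, though, that would make the tests look as if they had passed, so this function checks at the end whether any errors occurred.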
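The idea of "record failures, keep going, then fail once at the end" can be sketched in base R as follows. This is not the Runit command's actual code: softCheckEquals, reportCollectedFailures, and failureLog are hypothetical names, and all.equal() stands in for RUnit's checkEquals.

```r
# Hypothetical storage for the failure messages collected so far.
failureLog <- new.env()
failureLog$messages <- character(0)

# Stand-in for a non-aborting check: record the problem and carry on.
softCheckEquals <- function(target, current, msg = "") {
    result <- all.equal(target, current)
    if (!isTRUE(result)) {
        entry <- paste(c(result, msg), collapse = "\n")
        failureLog$messages <- c(failureLog$messages, entry)
    }
    invisible(isTRUE(result))
}

# Called once at the end of a test function: fail if anything was recorded.
reportCollectedFailures <- function() {
    collected <- failureLog$messages
    failureLog$messages <- character(0)
    if (length(collected) > 0) {
        stop("collected failures:\n", paste(collected, collapse = "\n\n"))
    }
    invisible(TRUE)
}

softCheckEquals(1, 1)
softCheckEquals(1, 2, msg = "text: first case")
softCheckEquals("a", "b", msg = "text: second case")

# tryCatch() is used here only so the sketch can print the final error message.
outcome <- tryCatch(reportCollectedFailures(),
                    error = function(e) conditionMessage(e))
cat(outcome, "\n")
```

All three checks run, and the single error raised at the end carries the messages for both failing cases.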

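Running the tests

In the end I settled on the following test script. showDiff is a function defined inside Runit: the first argument is the value under test, the second is the expected value, and the third is a message to print before those values.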

#-----------------------------------------------------------
# Test script for count_screen_names.R
# Generated by Runit command on 2012-01-09 at 10:35:59
#-----------------------------------------------------------
source("/Users/arabiki/work/r/count_screen_names.R")

# This function is executed directly before each test function execution
.setUp <- function() {
}

# This function is executed directly after each test function execution
.tearDown <- function() {
}

test.countScreenNames <- function() {
    test.cases <- list(list(text     = "a_bicky@example.com",
                            expected = table(character(0))),
                       list(text     = "ほげ@a_bicky",
                            expected = table("a_bicky")),
                       list(text     = "@a_bicky@a_bicky",
                            expected = table(character(0))),
                       list(text     = "@a_bicky @a_bicky",
                            expected = table(c("a_bicky", "a_bicky"))),
                       list(text     = "@ほげ",
                            expected = table(character(0))))

    for (test.case in test.cases) {
        received <- countScreenNames(test.case$text)
        expected <- test.case$expected
        checkEquals(received, expected,
                    msg = showDiff(received, expected, sprintf("\ntext: %s", test.case$text)))
    }

    # next code is executed if this test script is executed using Runit command
    if (exists("checkFailure") && is.function(checkFailure)) {
        checkFailure()
    }
}

Let's run it.

$ Runit test.count_screen_names.R


Executing test function test.countScreenNames  ... Timing stopped at: 0.015 0.001 0.017 
 done successfully.

RUNIT TEST PROTOCOL -- Mon Jan  9 10:43:26 2012 
*********************************************** 
Number of test functions: 1 
Number of errors: 0 
Number of failures: 1 

 
1 Test Suite : 
test - 1 test function, 0 errors, 1 failure



Details 
*************************** 
Test Suite: test 
Test function regexp: ^test.+ 
Test file regexp: ^test.count_screen_names.R$ 
Involved directory: 
/Users/arabiki/work/r 
--------------------------- 
Test file: /Users/arabiki/work/r/test.count_screen_names.R 
test.countScreenNames: FAILURE !! (check number 5)
Error in checkFailure() : received below error messages
-- error messages --
Error in RUnit::checkEquals(...) : names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ

text: a_bicky@example.com
received:

example 
      1 

expected:
character(0)


Error in RUnit::checkEquals(...) : names for target but not for current
Attributes: < Component 2: Mean relative difference: 1 >
Attributes: < Component 3: Component 1: Modes: character, NULL >
Attributes: < Component 3: Component 1: target is character, current is NULL >
Numeric: lengths (1, 0) differ

text: @a_bicky@a_bicky
received:

a_bicky 
      2 

expected:
character(0)

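As you can see, test.count_screen_names.R tested all five test cases, two of them produced errors, and we can even see each received value alongside the expected one.
Now that we know which tests are tripping, let's rewrite the function so that they pass.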

count_screen_names.R

countScreenNames <- function(text, strict = TRUE) {
    regex <- "(?x)
              (?:
                  (?<!\\w) ( [@＠] ) ( (?>\\w+) ) (?![@＠])
                  | [\\s\\S]
              )"
    screenNames <- gsub(regex, "\\1\\2", text, perl = TRUE)
    table(strsplit(substring(screenNames, 2), "[@＠]")[[1]])
}
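The rewritten pattern adds three things: a negative lookbehind (?<!\w) so that a word character before the @ blocks the match, an atomic group (?>\w+) so that the name cannot be shortened by backtracking, and a negative lookahead (?![@＠]) so that a trailing @ blocks the match. The sketch below isolates those pieces. The name extractScreenNames is hypothetical, and to stay independent of locale settings it handles only the ASCII @ and spells out [A-Za-z0-9_] instead of \w.

```r
# Hypothetical helper; the atomic flag toggles the (?>...) group on and off.
extractScreenNames <- function(text, atomic = TRUE) {
    name <- if (atomic) "(?>[A-Za-z0-9_]+)" else "[A-Za-z0-9_]+"
    pattern <- paste0("(?<![A-Za-z0-9_])@", name, "(?!@)")
    matches <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
    substring(matches, 2)  # drop the leading "@"
}

print(extractScreenNames("a_bicky@example.com"))               # lookbehind blocks it
print(extractScreenNames("@a_bicky @a_bicky"))                  # both names found
print(extractScreenNames("@a_bicky@a_bicky"))                   # lookahead blocks it
print(extractScreenNames("@a_bicky@a_bicky", atomic = FALSE))   # backtracking sneaks through
```

Without the atomic group, the engine gives back the final y, finds that the next character is no longer an @, and reports the truncated name a_bick. That is why the lookahead alone is not enough.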

Let's run the tests.

$ Runit test.count_screen_names.R


Executing test function test.countScreenNames  ...  done successfully.

RUNIT TEST PROTOCOL -- Mon Jan  9 10:46:56 2012 
*********************************************** 
Number of test functions: 1 
Number of errors: 0 
Number of failures: 0 

 
1 Test Suite : 
test - 1 test function, 0 errors, 0 failures



Details 
*************************** 
Test Suite: test 
Test function regexp: ^test.+ 
Test file regexp: ^test.count_screen_names.R$ 
Involved directory: 
/Users/arabiki/work/r 
--------------------------- 
Test file: /Users/arabiki/work/r/test.count_screen_names.R 
test.countScreenNames: (5 checks) ... OK (0.01 seconds)

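The tests pass!!

By the way, RUnit also has runTestSuite, a function for running tests spread across multiple files, and a mechanism for checking which parts of the code were exercised during testing; for those, please see the following material by id:yokkuns.
Tokyor14 - R言語でユニットテスト (Unit testing in the R language)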
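For a quick taste of runTestSuite, here is a minimal sketch that assumes the RUnit package is installed. The temporary directory, the file name test.arithmetic.R, and the test function test.addition are all made up for the example. The coverage mechanism is not shown.

```r
library(RUnit)

# Hypothetical layout: a temporary directory holding one test file.
testDir <- file.path(tempdir(), "runit-demo")
dir.create(testDir, showWarnings = FALSE)
writeLines(c(
    "test.addition <- function() {",
    "    checkEquals(1 + 1, 2)",
    "}"
), file.path(testDir, "test.arithmetic.R"))

# defineTestSuite() selects test files and test functions by regular expression.
suite <- defineTestSuite("demo",
                         dirs = testDir,
                         testFileRegexp = "^test\\..+\\.R$",
                         testFuncRegexp = "^test.+")

result <- runTestSuite(suite)
printTextProtocol(result)

# getErrors() summarises the run as a list of counts.
summaryCounts <- getErrors(result)
```

Every file in the directory whose name matches testFileRegexp is picked up, so adding more test files needs no change to the suite definition.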

Well then, enjoy a comfortable testing life!!

P.S. If there is demand for the Runit command, I'll polish it into something a bit more decent.