OAuthに"対応"したtwitteRを試してみた
いつの間にかROAuthというパッケージが出てtwitteRがOAuth認証に”対応”していましたね!
早速使ってみました!
初期化
まずTwitterアプリを持っていない人はここからアプリを作成しましょう。
アプリの情報から “Consumer key”, “Consumer secret”, “Access token”, “Access token secret” をメモっておきます
※アプリのアクセス権限は “Read, write, and direct messages” にしておきましょう
init_twitteR.R
library(twitteR)
library(ROAuth)
# 初期化
twit.oauth <- OAuthFactory$new(handshakeComplete = TRUE,
signMethod = "HMAC",
consumerKey = "Your Consumer Key",
consumerSecret = "Your Consumer secret",
oauthKey = "Your Access token",
oauthSecret = "Your Access token secret")
# OAuth認証の登録
registerTwitterOAuth(twit.oauth)
あとはRJSONIO 0.3じゃないと日本語を扱えないので古いものをインストールしてください
$ sudo R CMD remove RJSONIO
$ wget http://cran.r-project.org/src/contrib/Archive/RJSONIO/RJSONIO_0.3-1.tar.gz
$ sudo R CMD INSTALL RJSONIO_0.3-1.tar.gz
ツイートの取得
手始めに非公開ユーザのツイートを取得してみます。
> source(init_twitteR.R)
> statuses <- userTimeline("a_bicky_dev")
> statuses[[1]]
[1] "a_bicky_dev: あ@a_bicky"
おおぉぉぉ、非公開ユーザなのにちゃんと取得できてますね!
twitteR は R5 を採用していて、status オブジェクトに対して $toDataFrame() で簡単にデータフレームに変換できます!
> statuses[[1]]$toDataFrame()
text favorited replyToSN created truncated replyToSID id
1 あ@a_bicky TRUE NA <NA> TRUE NA 2055340032
replyToUID statusSource screenName
1 NA web a_bicky_dev
おぉ、素晴らしいですね!!
twListToDFでまとめてデータフレームにすることもできます!
> head(twListToDF(statuses))
text favorited replyToSN created truncated replyToSID
1 あ@a_bicky TRUE <NA> <NA> TRUE NA
2 @hoge@ TRUE hoge <NA> TRUE NA
3 @ @hoge TRUE <NA> <NA> TRUE NA
4 ,#hoge .#hoge TRUE <NA> <NA> TRUE NA
5 %#hoge TRUE <NA> <NA> TRUE NA
6 !#hoge ?#hoge >#hoge TRUE <NA> <NA> TRUE NA
id replyToUID statusSource screenName
1 2055340032 <NA> web a_bicky_dev
2 -1157484543 3272141 web a_bicky_dev
3 -213774336 <NA> web a_bicky_dev
4 -608026624 <NA> web a_bicky_dev
5 1740775424 <NA> web a_bicky_dev
6 1946288128 <NA> web a_bicky_dev
素晴らしいっ!!!!
・・・あれ?created が NA ですね。
ちょっとおまじないをします。
> Sys.setlocale("LC_TIME", "en_US.utf-8")
> statuses <- userTimeline("a_bicky_dev")
> head(twListToDF(statuses))
text favorited replyToSN created truncated
1 あ@a_bicky TRUE <NA> 2011-11-06 13:12:08 TRUE
2 @hoge@ TRUE hoge 2011-11-06 13:11:10 TRUE
3 @ @hoge TRUE <NA> 2011-11-06 13:10:08 TRUE
4 ,#hoge .#hoge TRUE <NA> 2011-11-06 12:55:09 TRUE
5 %#hoge TRUE <NA> 2011-11-06 12:54:12 TRUE
6 !#hoge ?#hoge >#hoge TRUE <NA> 2011-11-06 12:52:51 TRUE
replyToSID id replyToUID statusSource screenName
1 NA 2055340032 <NA> web a_bicky_dev
2 NA -1157484543 3272141 web a_bicky_dev
3 NA -213774336 <NA> web a_bicky_dev
4 NA -608026624 <NA> web a_bicky_dev
5 NA 1740775424 <NA> web a_bicky_dev
6 NA 1946288128 <NA> web a_bicky_dev
おっ、ちゃんと日付が表示されました。
・・・よく見たらidがおかしいですね・・・
> str(twListToDF(statuses))
'data.frame': 20 obs. of 10 variables:
$ text : Factor w/ 20 levels "あ@a_bicky",..: 1 2 3 4 5 6 7 8 9 10 ...
$ favorited : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ replyToSN : chr NA "hoge" NA NA ...
$ created : POSIXct, format: "2011-11-06 13:12:08" "2011-11-06 13:11:10" ...
$ truncated : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ replyToSID : logi NA NA NA NA NA NA ...
$ id : Factor w/ 20 levels "2055340032","-1157484543",..: 1 2 3 4 5 6 7 8 9 10 ...
$ replyToUID : chr NA "3272141" NA NA ...
$ statusSource: Factor w/ 1 level "web": 1 1 1 1 1 1 1 1 1 1 ...
$ screenName : Factor w/ 1 level "a_bicky_dev": 1 1 1 1 1 1 1 1 1 1 ...
status_id が factor になってるじゃないですか!
まぁ置いときましょう。
Rからツイート!!
やっぱり twitteR の醍醐味はこれですよね!Rユーザだったら誰もがRからツイートしたいと思うはずです!
R から思いの丈をツイートしてみましょう!
> tweet("ありすちゃんとてれすくんが可愛すぎて生きてるのが辛い")
Error in .self$twFromJSON(out) :
Error: Could not authenticate with OAuth.
あれ?ツイートの内容が悪かったんですかね????
> tweet("I love R, I love R!!")
Error in .self$twFromJSON(out) :
Error: Could not authenticate with OAuth.
R にまで嫌われると僕はこの先どうやって生きていけばいいんですか・・・
taskStatusで処理が終了したら通知してもらおう!
taskStatus なんて関数ができていて、第1引数の処理が終了したら指定したユーザにDMを飛ばすことができるんですって!
> taskStatus("愛をください", "a_bicky")
Error in params[["text"]] <- text : object 'params' not found
・・・あれ?ちょっと欲張っちゃいましたかね??
twitteR使えないじゃん!!
今回新しくなったtwitteRを使ってみましたが、結論としては
全然使えない!
ちゃんと対応してほしいですね。以上。
・・・・
で終わらないためにパッチ作りましたよ!パッチ作りましたよ!!
ROAuth.patch
diff -cr ROAuth.orig/R/ROauth.R ROAuth/R/ROauth.R
*** ROAuth.orig/R/ROauth.R 2011-06-03 13:10:28.000000000 +0900
--- ROAuth/R/ROauth.R 2011-11-19 19:13:36.000000000 +0900
***************
*** 128,146 ****
auth <- signRequest(url, params, consumerKey, consumerSecret,
oauthKey=oauthKey, oauthSecret=oauthSecret,
httpMethod="POST", signMethod=signMethod)
opts <- list(...)
## post ,specify the method
- ## We should be able to use postForm() but we have to work out the issues
- ## with escaping, etc. to match the signature mechanism.
- if(TRUE) {
- reader <- dynCurlReader(curl, baseURL = url, verbose = FALSE)
- fields <- paste(names(auth), sapply(auth, curlPercentEncode),
- sep = "=", collapse = "&")
- curlPerform(curl = curl, URL = url, postfields = fields,
- writefunction = reader$update, ...)
- reader$value()
- } else
postForm(url, .params = c(params, lapply(auth, I)), curl = curl,
.opts = opts, style = "POST")
}
--- 128,137 ----
auth <- signRequest(url, params, consumerKey, consumerSecret,
oauthKey=oauthKey, oauthSecret=oauthSecret,
httpMethod="POST", signMethod=signMethod)
+ auth$oauth_signature <- encodeURI(auth$oauth_signature)
opts <- list(...)
## post ,specify the method
postForm(url, .params = c(params, lapply(auth, I)), curl = curl,
.opts = opts, style = "POST")
}
diff -cr ROAuth.orig/R/sign.R ROAuth/R/sign.R
*** ROAuth.orig/R/sign.R 2011-06-03 13:10:28.000000000 +0900
--- ROAuth/R/sign.R 2011-11-19 19:11:10.000000000 +0900
***************
*** 2,13 ****
oauthKey = "", oauthSecret = "", httpMethod = "GET",
signMethod = "HMAC", nonce = genNonce(),
timestamp = Sys.time(),
! escapeFun = curlPercentEncode) {
## Sign an request made up of the URL, the parameters as a named character
## vector the consumer key and secret and the token and token secret.
httpMethod <- toupper(httpMethod)
signMethod <- toupper(signMethod)
!
params["oauth_nonce"] <- nonce
params["oauth_timestamp"] <- as.integer(timestamp)
--- 2,14 ----
oauthKey = "", oauthSecret = "", httpMethod = "GET",
signMethod = "HMAC", nonce = genNonce(),
timestamp = Sys.time(),
! escapeFun = encodeURI) {
## Sign an request made up of the URL, the parameters as a named character
## vector the consumer key and secret and the token and token secret.
httpMethod <- toupper(httpMethod)
signMethod <- toupper(signMethod)
!
! params <- sapply(params, escapeFun, simplify = FALSE)
params["oauth_nonce"] <- nonce
params["oauth_timestamp"] <- as.integer(timestamp)
***************
*** 28,34 ****
## the resulting % prefix in the escaped characters, e.g. %20 becomes
## %2520 as %25 is the escape for %
params <- params[order(names(params))]
! args <- paste(names(params), sapply(params, escapeFun, post.amp = TRUE),
sep = "%3D", collapse = "%26")
if(is.null(oauthSecret))
--- 29,35 ----
## the resulting % prefix in the escaped characters, e.g. %20 becomes
## %2520 as %25 is the escape for %
params <- params[order(names(params))]
! args <- paste(names(params), sapply(params, escapeFun),
sep = "%3D", collapse = "%26")
if(is.null(oauthSecret))
***************
*** 42,48 ****
sig <- signString(odat, okey, signMethod)
! params["oauth_signature"] <- sig # curlPercentEncode(sig)
params[grepl("^oauth_", names(params))]
}
--- 43,49 ----
sig <- signString(odat, okey, signMethod)
! params["oauth_signature"] <- sig
params[grepl("^oauth_", names(params))]
}
***************
*** 98,100 ****
--- 99,117 ----
signWithPlaintext <- function(key, data) {
key
}
+
+ encodeURI <- function(URI) {
+ if (!is.character(URI)) {
+ URI
+ } else {
+ OK <- "[^-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.~]"
+ x <- strsplit(URI, "")[[1L]]
+ z <- grep(OK, x)
+ if (length(z)) {
+ y <- sapply(x[z], function(x) paste("%", toupper(as.character(charToRaw(x))),
+ sep = "", collapse = ""))
+ x[z] <- y
+ }
+ paste(x, collapse = "")
+ }
+ }
twitteR.patch
diff -cr twitteR.orig/R/base.R twitteR/R/base.R
*** twitteR.orig/R/base.R 2011-11-09 07:15:21.000000000 +0900
--- twitteR/R/base.R 2011-11-19 23:29:19.000000000 +0900
***************
*** 1,7 ****
setRefClass('twitterObj',
contains='VIRTUAL',
methods = list(
! toDataFrame = function(row.names=NULL, optional=FALSE) {
fields <- names(.self$getRefClass()$fields())
fieldList <- lapply(fields, function(x) {
val <- .self$field(x)
--- 1,7 ----
setRefClass('twitterObj',
contains='VIRTUAL',
methods = list(
! toDataFrame = function(row.names=NULL, optional=FALSE, stringsAsFactors=FALSE) {
fields <- names(.self$getRefClass()$fields())
fieldList <- lapply(fields, function(x) {
val <- .self$field(x)
***************
*** 12,18 ****
})
names(fieldList) <- fields
as.data.frame(fieldList, row.names=row.names,
! optional=optional)
}
)
)
--- 12,18 ----
})
names(fieldList) <- fields
as.data.frame(fieldList, row.names=row.names,
! optional=optional, stringsAsFactors=stringsAsFactors)
}
)
)
diff -cr twitteR.orig/R/comm.R twitteR/R/comm.R
*** twitteR.orig/R/comm.R 2011-11-17 08:39:16.000000000 +0900
--- twitteR/R/comm.R 2011-11-19 23:29:19.000000000 +0900
***************
*** 201,207 ****
## epoch time, and then try a few data string formats
dateInt <- suppressWarnings(as.numeric(dateStr))
if (!is.na(dateInt)) {
! posDate <- as.POSIXct(dateInt, origin='1970-01-01')
} else {
posDate <- as.POSIXct(dateStr, tz='UTC',
format="%a %b %d %H:%M:%S +0000 %Y")
--- 201,207 ----
## epoch time, and then try a few data string formats
dateInt <- suppressWarnings(as.numeric(dateStr))
if (!is.na(dateInt)) {
! posDate <- as.POSIXct(dateInt, tz='UTC', origin='1970-01-01')
} else {
posDate <- as.POSIXct(dateStr, tz='UTC',
format="%a %b %d %H:%M:%S +0000 %Y")
diff -cr twitteR.orig/R/dm.R twitteR/R/dm.R
*** twitteR.orig/R/dm.R 2011-11-09 07:15:21.000000000 +0900
--- twitteR/R/dm.R 2011-11-19 23:29:19.000000000 +0900
***************
*** 37,44 ****
senderID <<- json[['sender_id']]
if (!is.null(json[['sender_screen_name']]))
senderSN <<- json[['sender_screen_name']]
! if (!is.null(json[['id']]))
! id <<- json[['id']]
}
callSuper(...)
},
--- 37,44 ----
senderID <<- json[['sender_id']]
if (!is.null(json[['sender_screen_name']]))
senderSN <<- json[['sender_screen_name']]
! if (!is.null(json[['id_str']]))
! id <<- json[['id_str']]
}
callSuper(...)
},
***************
*** 95,104 ****
if (!hasOAuth())
stop("dmSend requires OAuth authentication")
if (inherits(user, "user"))
! user <- screenName(user)
if (nchar(text) > 140)
stop("Maximum of 140 chars may be sent via a direct message")
! params[['text']] <- text
res <- twInterfaceObj$doAPICall('direct_messages/new',
params=params, method='POST', ...)
dmFactory$new(res)
--- 95,108 ----
if (!hasOAuth())
stop("dmSend requires OAuth authentication")
if (inherits(user, "user"))
! uParams <- c(list(...)$uParams, list(screen_name = screenName(user)))
! else
! uParams <- parseUsers(user)
if (nchar(text) > 140)
stop("Maximum of 140 chars may be sent via a direct message")
! params <- c(list(...)$params, list(text = text))
! params[['user_id']] <- uParams[['user_id']]
! params[['screen_name']] <- uParams[['screen_name']]
res <- twInterfaceObj$doAPICall('direct_messages/new',
params=params, method='POST', ...)
dmFactory$new(res)
diff -cr twitteR.orig/R/statuses.R twitteR/R/statuses.R
*** twitteR.orig/R/statuses.R 2011-11-09 07:15:21.000000000 +0900
--- twitteR/R/statuses.R 2011-11-23 01:29:51.000000000 +0900
***************
*** 33,43 ****
}
if (!is.null(json[['text']]))
text <<- json[['text']]
! if (is.null(json[['favorited']]))
favorited <<- FALSE
else
favorited <<- TRUE
! if (is.null(json[['truncated']]))
truncated <<- FALSE
else
truncated <<- TRUE
--- 33,45 ----
}
if (!is.null(json[['text']]))
text <<- json[['text']]
! if ((is.null(json[['favorited']])) ||
! (json[['favorited']] == FALSE))
favorited <<- FALSE
else
favorited <<- TRUE
! if ((is.null(json[['truncated']])) ||
! (json[['truncated']] == FALSE))
truncated <<- FALSE
else
truncated <<- TRUE
***************
*** 50,63 ****
if ((!is.null(json[['in_reply_to_screen_name']])) &&
(!is.na(json[['in_reply_to_screen_name']])))
replyToSN <<- json[['in_reply_to_screen_name']]
! if ((!is.null(json[['in_reply_to_status_id']])) &&
! (!is.na(json[['in_reply_to_status_id']])))
! replyToSID <<- as.character(json[['in_reply_to_status_id']])
! if ((!is.null(json[['in_reply_to_user_id']])) &&
! (!is.na(json[['in_reply_to_user_id']])))
! replyToUID <<- as.character(json[['in_reply_to_user_id']])
! if (!is.null(json[['id']]))
! id <<- as.character(json[['id']])
}
callSuper(...)
}
--- 52,65 ----
if ((!is.null(json[['in_reply_to_screen_name']])) &&
(!is.na(json[['in_reply_to_screen_name']])))
replyToSN <<- json[['in_reply_to_screen_name']]
! if ((!is.null(json[['in_reply_to_status_id_str']])) &&
! (!is.na(json[['in_reply_to_status_id_str']])))
! replyToSID <<- as.character(json[['in_reply_to_status_id_str']])
! if ((!is.null(json[['in_reply_to_user_id_str']])) &&
! (!is.na(json[['in_reply_to_user_id_str']])))
! replyToUID <<- as.character(json[['in_reply_to_user_id_str']])
! if (!is.null(json[['id_str']]))
! id <<- as.character(json[['id_str']])
}
callSuper(...)
}
diff -cr twitteR.orig/R/trends.R twitteR/R/trends.R
*** twitteR.orig/R/trends.R 2011-11-09 07:15:21.000000000 +0900
--- twitteR/R/trends.R 2011-11-19 23:29:19.000000000 +0900
***************
*** 51,57 ****
buildTrend <- function(json, date) {
## we don't need to do the fancy twitter date mapping, this one is
## already ok
! trendFactory$new(json, as.POSIXct(date))
}
setMethod('show', signature='trend', function(object) {
--- 51,57 ----
buildTrend <- function(json, date) {
## we don't need to do the fancy twitter date mapping, this one is
## already ok
! trendFactory$new(json, as.POSIXct(date, tz='UTC'))
}
setMethod('show', signature='trend', function(object) {
diff -cr twitteR.orig/R/users.R twitteR/R/users.R
*** twitteR.orig/R/users.R 2011-11-09 10:10:49.000000000 +0900
--- twitteR/R/users.R 2011-11-19 23:29:19.000000000 +0900
***************
*** 60,67 ****
verified <<- TRUE
if (is.character(json[['screen_name']]))
screenName <<- json[['screen_name']]
! if (!is.null(json[['id']]))
! id <<- as.character(json[['id']])
if (!is.null(json[['location']]))
location <<- json[['location']]
}
--- 60,67 ----
verified <<- TRUE
if (is.character(json[['screen_name']]))
screenName <<- json[['screen_name']]
! if (!is.null(json[['id_str']]))
! id <<- as.character(json[['id_str']])
if (!is.null(json[['location']]))
location <<- json[['location']]
}
パッチを当ててみます。
$ wget http://cran.r-project.org/src/contrib/ROAuth_0.9.0.tar.gz
$ tar xvfz ROAuth_0.9.0.tar.gz
$ patch -c -p1 -d ROAuth < ROAuth.patch
$ sudo R CMD remove ROAuth
$ sudo R CMD INSTALL ROAuth
$ wget http://cran.r-project.org/src/contrib/twitteR_0.99.14.tar.gz
$ tar xvfz twitteR_0.99.14.tar.gz
$ patch -c -p1 -d twitteR < twitteR.patch
$ sudo R CMD remove twitteR
$ sudo R CMD install twitteR
気を取り直して・・・
気を取り直して実行してみます!!ドキドキ・・・
$ R
> source("init_twitteR.R")
> Sys.setlocale("LC_TIME", "en_US.utf-8")
> tweet("ありすちゃんとてれすくんが可愛すぎて生きてるのが辛い")
[1] "a_bicky: ありすちゃんとてれすくんが可愛すぎて生きてるのが辛い"
おおおおぉぉぉぉぉ、思いが届きましたよ!!!!
> taskStatus("愛をください", "a_bicky")
[1] "愛をください"
> dms <- dmGet()
> dms[[1]]
[1] "a_bicky->a_bicky: Your task has completed successfully:"
> dms[[1]]$destroy()
[1] TRUE
おおおおぉぉぉぉぉ、便利!!!!
そんなわけで、結論
twitteR 超便利!!
これで R からツイートできなくてストレスを抱えていた人もストレスフリーな R ライフが送れますね!!
※追記
twitteRの作者にメールしました。
- ALPHA, DIGIT, ‘-‘, ‘.’, ‘_’, ‘~’はエンコードしちゃいけないってこと
- curlPercentEncodeだと日本語はエンコードされないってこと
> char <- "\u3042" # a Japanese character which must be encoded > identical(charToRaw(char), charToRaw(curlPercentEncode(char))) TRUE
- idを使うんじゃなくてid_strを使うべきだということ(今だとJSONの変換の時点でオーバーフローしている)
> as.character(as.numeric("133169796689821690")) [1] "133169796689821696"
- 全てのas.POSIXctにtz=”UTC”を指定すべきこと
> as.POSIXct(1321716739, origin='1970-01-01') [1] "2011-11-19 15:32:19 JST" > as.POSIXct(1321716739, tz='UTC', origin='1970-01-01') [1] "2011-11-19 15:32:19 UTC" > as.integer(as.POSIXct(1321716739, origin='1970-01-01')) [1] 1321684339 > as.integer(as.POSIXct(1321716739, tz='UTC', origin='1970-01-01')) [1] 1321716739
- dmSendが全く機能していないこと
- twitterObj$toDataFrameにstringAsFactorsオプションをつけるべきこと
- 自分の環境だと Sys.setlocale(“LC_TIME”, “en_US.utf-8”) を実行しないと twitterDateToPOSIXが機能しないこと
- JSONIO 0.3を使わないとUnicode文字列が扱えないこと
を伝えました。た、たぶん伝わってるはず・・・