
I have a data field containing company names, for example:

company <- data.frame(Company = c("Microsoft", "Apple", "Cloudera", "Ford"))
> company

    Company
1 Microsoft
2     Apple
3  Cloudera
4      Ford

and so on.

The tm.plugin.webmining package lets you query data from Yahoo! Finance based on a ticker symbol:

require(tm.plugin.webmining)
results <- WebCorpus(YahooFinanceSource("MSFT")) 

I am missing the intermediate step. How can I programmatically look up the ticker symbol from a company name?

1 Answer

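I couldn't manage this with the tm.plugin.webmining package, but I came up with a rough solution: download and parse the data from this file: ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt. I say rough because, for some reason, my calls to httr::content(httr::GET(...)) did not work every time. I think it has to do with the ftp:// URL scheme (it seemed to work better on my Linux machine than on my Mac, but that may be irrelevant). Thanks to @thelatemail's comment, this seems to work much more smoothly: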

library(quantmod) ## optional
symbolData <- read.csv(
  "ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt",
  sep="|")
##
> head(symbolData,10)
   Symbol                                                   Security.Name Market.Category Test.Issue Financial.Status Round.Lot.Size
1    AAIT iShares MSCI All Country Asia Information Technology Index Fund               G          N                N            100
2     AAL                    American Airlines Group, Inc. - Common Stock               Q          N                N            100
3    AAME                    Atlantic American Corporation - Common Stock               G          N                N            100
4    AAOI                    Applied Optoelectronics, Inc. - Common Stock               G          N                N            100
5    AAON                                       AAON, Inc. - Common Stock               Q          N                N            100
6    AAPL                                       Apple Inc. - Common Stock               Q          N                N            100
7    AAVL                  Avalanche Biotechnologies, Inc. - Common Stock               G          N                N            100
8    AAWW                     Atlas Air Worldwide Holdings - Common Stock               Q          N                N            100
9    AAXJ               iShares MSCI All Country Asia ex Japan Index Fund               G          N                N            100
10   ABAC                        Aoxin Tianli Group, Inc. - Common Shares               S          N                N            100
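
With a table like this loaded, the step the question asks about (company name to ticker) comes down to matching each name against the Security.Name column. Below is a minimal sketch under a few assumptions: symbolSample is a small hand-written stand-in for symbolData (its rows are illustrative, not a download), lookupTicker() is a hypothetical helper, and a plain case-insensitive substring match is used here in place of fuzzy matching so the result is easy to predict.

```r
## Stand-in for the first two columns of symbolData; the rows are
## illustrative and were typed in by hand, not downloaded.
symbolSample <- data.frame(
  Symbol = c("AAL", "AAPL", "MSFT"),
  Security.Name = c("American Airlines Group, Inc. - Common Stock",
                    "Apple Inc. - Common Stock",
                    "Microsoft Corporation - Common Stock"),
  stringsAsFactors = FALSE)

## lookupTicker() is a hypothetical helper: it returns every symbol whose
## security name contains the company name, ignoring case.
lookupTicker <- function(name, symbols = symbolSample) {
  hits <- grepl(tolower(name), tolower(symbols$Security.Name), fixed = TRUE)
  symbols$Symbol[hits]
}

company <- c("Microsoft", "Apple", "Cloudera", "Ford")

## Named list with one entry per company; names without a match
## get a zero-length character vector.
tickers <- lapply(setNames(company, company), lookupTicker)
```

Companies that do not appear in the table (here Cloudera and Ford) come back empty rather than raising an error, so they are easy to spot and handle separately.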

Edit: As @GSee suggested, a (probably) more robust way to get the source data is the stockSymbols() function from the TTR package:

> symbolData2 <- stockSymbols(exchange="NASDAQ")
Fetching NASDAQ symbols...
> ##
> head(symbolData2)
  Symbol                                                           Name LastSale    MarketCap IPOyear         Sector
1   AAIT iShares MSCI All Country Asia Information Technology Index Fun   34.556      6911200      NA           <NA>
2    AAL                                  American Airlines Group, Inc.   40.500  29164164453      NA Transportation
3   AAME                                  Atlantic American Corporation    4.020     83238028      NA        Finance
4   AAOI                                  Applied Optoelectronics, Inc.   20.510    303653114    2013     Technology
5   AAON                                                     AAON, Inc.   18.420   1013324613      NA  Capital Goods
6   AAPL                                                     Apple Inc.  103.300 618546661100    1980     Technology
                         Industry Exchange
1                            <NA>   NASDAQ
2   Air Freight/Delivery Services   NASDAQ
3                  Life Insurance   NASDAQ
4                  Semiconductors   NASDAQ
5 Industrial Machinery/Components   NASDAQ
6          Computer Manufacturing   NASDAQ

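I don't know whether you only wanted to get ticker symbols from names, but if you are also after the actual share price data, you could do something like this: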

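namedStock <- function(name="Microsoft",
                       start=Sys.Date()-365,
                       end=Sys.Date()-1){
  ## fuzzy-match the name against the Security.Name column (column 2)
  ## and take the matching Symbol (column 1); as.character() guards
  ## against the columns having been read in as factors
  ticker <- as.character(symbolData[agrep(name,symbolData[,2]),1])
  ## getSymbols() comes from quantmod
  getSymbols(
    Symbols=ticker,
    src="yahoo",
    env=.GlobalEnv,
    from=start,to=end)
}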
##
## an xts object named MSFT will be added to
## the global environment, no need to assign
## to an object
namedStock()
##
> str(MSFT)
An ‘xts’ object on 2013-09-03/2014-08-29 containing:
  Data: num [1:251, 1:6] 31.8 31.4 31.1 31.3 31.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:6] "MSFT.Open" "MSFT.High" "MSFT.Low" "MSFT.Close" ...
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
List of 2
 $ src    : chr "yahoo"
 $ updated: POSIXct[1:1], format: "2014-09-02 21:51:22.792"
> chartSeries(MSFT)
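
One thing to watch in namedStock(): a fuzzy match can return no symbol at all, or several. The sketch below shows one possible guard. chooseTicker() is a hypothetical helper (not part of quantmod or TTR), and taking the first candidate is just an assumption about what a sensible default might be.

```r
## chooseTicker() is a hypothetical helper: it reduces the candidate
## symbols found for a company name to a single ticker.
chooseTicker <- function(candidates, name) {
  candidates <- unique(as.character(candidates))
  if (length(candidates) == 0) {
    stop("no symbol found for '", name, "'")
  }
  if (length(candidates) > 1) {
    ## Assumption: fall back to the first match, but tell the caller
    warning("several symbols match '", name, "': ",
            paste(candidates, collapse = ", "),
            "; using ", candidates[1])
  }
  candidates[1]
}

single <- chooseTicker("MSFT", "Microsoft")
```

Inside namedStock(), the result of symbolData[agrep(name,symbolData[,2]),1] could be passed through such a helper before calling getSymbols(), so that an ambiguous name does not silently download several series.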


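So like I said, this is not the cleanest solution, but hopefully it helps you out. Also note that my data source only pulls companies traded on NASDAQ (which is most major companies), but you could easily combine it with other sources.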
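
As a rough illustration of combining sources: Ford, from the question's list, trades on the NYSE under the symbol F, so a NASDAQ-only table will not find it. The sketch below stacks two symbol tables that share the Symbol, Name and Exchange columns seen in the stockSymbols() output. Both data frames are hand-written stand-ins; with real data, the second table might come from something like stockSymbols(exchange="NYSE"), which is an assumption about your setup rather than something tested here.

```r
## Hand-written stand-ins for two exchange listings; in practice these
## would come from real symbol tables with the same columns.
nasdaqSample <- data.frame(
  Symbol = c("AAPL", "MSFT"),
  Name = c("Apple Inc.", "Microsoft Corporation"),
  Exchange = "NASDAQ",
  stringsAsFactors = FALSE)

nyseSample <- data.frame(
  Symbol = "F",
  Name = "Ford Motor Company",
  Exchange = "NYSE",
  stringsAsFactors = FALSE)

## rbind() works here because both tables have identical column names
allSymbols <- rbind(nasdaqSample, nyseSample)

## Case-insensitive substring lookup across both exchanges
fordTicker <- allSymbols$Symbol[
  grepl("ford", tolower(allSymbols$Name), fixed = TRUE)]
```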

answered 2014-09-03T02:07:06.893