r - Appending with RODBC::sqlSave on an Azure SQL db is impossible

Question

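So, there are many questions on StackOverflow about RODBC problems, but I have not yet seen this particular variant: trying to append to an AZURE SQL database. I really think there should be a parameter that lets you identify the KEY instead of trying to load it... I would prepare a pull request, but RODBC does not seem to have a dev branch on GitHub? Anyway, I'll post my problem and what I tried to do, followed by my nasty workaround.

I have my data in a data frame called ActDF.new with the following properties: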

str(ActDF.new)
'data.frame':   52 obs. of  10 variables:
 $ Date          : Date, format: "2016-03-23" "2016-03-23" "2016-03-23" "2016-03-23" ...
 $ Project       : Factor w/ 1 level "x": 1 1 1 1 1 1 1 1 1 1 ...
 $ IndName       : Factor w/ 26 levels "x x...etc",..: 2 17 1 4 11 12 8 3 25 6 ...
 $ IndNum        : num  1 2 3 4 5 6 7 8 9 10 ...
 $ ProjectYear   : Factor w/ 2 levels "bla","blabla": 1 1 1 1 1 1 1 1 1 1 ...
 $ Value         : num  NA NA NA NA 4883 ...
 $ NoteTitle     : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
 $ NoteAnnotation: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
 $ ID            : num  1 1 1 1 1 1 1 1 1 1 ...
 $ CorpCode      : ch

I want to append this information to a table called Actuals in a database. So I tried to use RODBC::sqlSave to do it. Here is the blow-by-blow:

Connect to the database

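library(RODBC)  # odbcConnect, sqlQuery, sqlSave
library(dplyr)  # %>%, filter, select (used further down)

d <- "Actuals RW"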
p <- "xx"
u <- "xx"
channel <- odbcConnect(d,u,p)

Get the number of rows in the database table (to know where the KEY should start)

PresentNum <- sqlQuery(channel, 'SELECT count(*) FROM Actuals', rows_at_time = 5)
PresentNum <- PresentNum[1,1]

OK, so let's add an ID to my DF. There are also a ton of NAs in Value... I don't need those, so let's transform it into a nicer DF

## Initialize ID on this df
ActDF.new$ID <- 1

## Remove NAs from ActDF.new, and organize
toSave <- ActDF.new %>% filter(!is.na(Value)) %>%
  select (ID,Date,Project,FiscalYear=ProjectYear,IndNum,IndName,CorpCode,CurrentValue=Value,NoteTitle,NoteAnnotation)

## And now issue correct numbers to the ID
toSave$ID <- (PresentNum+1):(nrow(toSave)+PresentNum)

There are a lot of blank values, so let's convert them to NA (this is a nasty way of doing it... I know)

toSave <- 
  apply(toSave, 2, function(x) gsub("^$|^ $", NA, x))  %>% as.data.frame()

## Now everything is a factor (or character, depending on the R version); convert to correct format
## as.numeric() on a factor returns level codes, so go through as.character() first
toSave$ID <- as.numeric(as.character(toSave$ID))
toSave$Date <- as.Date(toSave$Date)
toSave$Project <- as.character(toSave$Project)
toSave$FiscalYear <- as.character(toSave$FiscalYear)
toSave$IndNum <- as.character(toSave$IndNum)
toSave$IndName <- as.character(toSave$IndName)
toSave$CorpCode <- as.character(toSave$CorpCode)
toSave$CurrentValue <- as.numeric(as.character(toSave$CurrentValue))
toSave$NoteTitle <- as.character(toSave$NoteTitle)
toSave$NoteAnnotation <- as.character(toSave$NoteAnnotation)
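The blank-to-NA step can also be done column by column, which avoids the round trip through a character matrix and the type conversions that follow it. This is a minimal base-R sketch; `blank_to_na` and the demo data frame are hypothetical names made up for illustration, and the pattern `"^ ?$"` matches the same two cases as the original regex (an empty string or a single space).

```r
# Hypothetical helper: turn "" and " " into NA in text columns only,
# leaving numeric and Date columns untouched.
blank_to_na <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.factor(x)) x <- as.character(x)  # work on the labels, not the codes
    if (is.character(x)) {
      x[grepl("^ ?$", x)] <- NA_character_  # grepl() is FALSE for existing NAs
    }
    df[[col]] <- x
  }
  df
}

# Small made-up example, not the real ActDF.new
demo_blank <- data.frame(NoteTitle = c("", " ", "memo"),
                         Value = c(1, NA, 3),
                         stringsAsFactors = TRUE)
cleaned <- blank_to_na(demo_blank)
str(cleaned)
```

Because numeric columns never pass through `as.character()`, no re-conversion block is needed afterwards.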

### OK, we're ready! So try appending! ###

################# Test 1 TRY APPENDING DATA AS IS
sqlSave(channel, toSave, tablename = 'Actuals', append = T,
        rownames = F, colnames = F, verbose = T,
        safer = T, addPK = F, 
        fast = T, test = F)
### RETURNS ERROR: Cannot insert explicit value for identity column in table 'Actuals' when IDENTITY_INSERT is set to OFF.

Huh... fair enough...

################# Test 2 TRY TO CHANGE THE IDENTITY_INSERT PROPERTY  

sqlQuery(channel, "Set IDENTITY_INSERT Actuals ON", errors = TRUE)
### RETURNS ERROR: Cannot find the object \"Actuals\" because it does not exist or you do not have permissions."

Oh... but... wait, what? The table definitely exists, and I have RW permissions. So maybe IDENTITY_INSERT is somehow different... what is its status anyway?

sqlQuery(channel, "SELECT OBJECTPROPERTY(OBJECT_ID('Actuals'), 'TableHasIdentity')")
### RETURNS 1.
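One possible explanation for the Test 2 error, offered as an assumption rather than something confirmed in this thread: the SQL Server documentation states that SET IDENTITY_INSERT requires owning the table or holding ALTER permission on it, so a login with only read/write rights may get the "does not exist or you do not have permissions" message. The sketch below builds a query with the T-SQL function HAS_PERMS_BY_NAME to check this; `perm_query` is a hypothetical helper name.

```r
# Hypothetical helper: build a query asking whether the current login
# holds a given permission on a table (1 = yes, 0 = no).
perm_query <- function(table, permission = "ALTER") {
  sprintf("SELECT HAS_PERMS_BY_NAME('%s', 'OBJECT', '%s') AS has_perm",
          table, permission)
}

q <- perm_query("Actuals")
print(q)

# With a live connection (assumes `channel` from odbcConnect above):
# sqlQuery(channel, q)
```

If this returns 0, the identity column cannot be written explicitly from this login, and leaving ID out of the INSERT is the remaining option.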

Hmm. I have no idea what addPK means... let's try again with a different setting

#################  Test 3, try to use addPK = TRUE to see if it makes difference.
sqlSave(channel, toSave, tablename = 'Actuals', append = T,
        rownames = F, colnames = F, verbose = T,
        safer = T, addPK = T, 
        fast = T, test = F)
### RETURNS ERROR: Cannot find the object \"Actuals\" because it does not exist or you do not have permissions."

That's not annoying at all. OK FINE, let's remove the ID altogether

################# Test 4, Try to remove the ID
NoID <- toSave[,-grep("ID",names(toSave))]

sqlSave(channel, NoID, tablename = 'Actuals', append = T,
        rownames = FALSE, colnames = FALSE, verbose = T,
        safer = TRUE, addPK = F, 
        fast = T, test = F)
### RETURNS ERROR: Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test,  : missing columns in 'data'

Oh really? Missing columns??????? Fine

## So add back in a dummy column
NoID$dummy <- 0
sqlSave(channel, NoID, tablename = 'Actuals', append = T,
        rownames = FALSE, colnames = FALSE, verbose = T,
        safer = TRUE, addPK = F, 
        fast = T, test = F)
### RETURNS ERROR: Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test,  : missing columns in 'data'

Setting fast=F returns the error: length of 'dimnames' [2] not equal to array extent

OK. gg sqlSave, you win, I lose. Here is where I think a change is needed... Inspecting the SQL query it is building, we see:

Query: INSERT INTO "Actuals" ( "ID", "Date", "Project", "FiscalYear", "IndNum", "CorpCode", "CurrentValue", "NoteTitle", "NoteAnnotation", "IndName" ) VALUES ( ?,?,?,?,?,?,?,?,? )
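A quick way to read a logged query like this one is to count the column names against the `?` placeholders. The base-R sketch below does that for the query text shown above; the variable names are made up for illustration. That the mismatch comes from sqlSave listing the table's columns while binding only the data frame's columns is an inference, not something the output itself states.

```r
# The query text as logged by sqlSave(verbose = TRUE) above
logged <- paste0(
  'INSERT INTO "Actuals" ( "ID", "Date", "Project", "FiscalYear", "IndNum", ',
  '"CorpCode", "CurrentValue", "NoteTitle", "NoteAnnotation", "IndName" ) ',
  'VALUES ( ?,?,?,?,?,?,?,?,? )'
)

# Pull out the two parenthesised groups: column list, then placeholder list
groups <- regmatches(logged, gregexpr("\\([^()]*\\)", logged))[[1]]

n_cols   <- length(strsplit(groups[1], ",", fixed = TRUE)[[1]])
n_params <- lengths(regmatches(groups[2], gregexpr("?", groups[2], fixed = TRUE)))

print(c(columns = n_cols, placeholders = n_params))
```

The statement names more columns (including "ID") than it has placeholders, so it could not run as written even without the identity restriction.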

I think there needs to be some parameter that lets me simply tell the query not to try to append to the ID column... right?

Am I missing something?

score 1 · Accepted Answer

Here is my workaround:

################# Test 5, Try issuing the append command manually:

Q <- "INSERT INTO \"Actuals\" ( \"Project\", \"FiscalYear\") VALUES ('test','hello');"
sqlQuery(channel, Q, errors = TRUE)

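OK, so that works! (So I know it's not a permissions problem.) It comes down to the format: it needs double quotes around the table/field names and single quotes around the data. OK, now let's try to apply this logic to our real data: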

## first drop the dummy again:
NoID <- NoID[,-grep("dummy",names(NoID))]

## Ensure field names are surrounded by a DOUBLE quote, values are surrounded by a SINGLE quote. 
## Separate out the Date field because `paste` converts it to character if it's in with the rest of the data.

Q <- paste(
  "INSERT INTO \"Actuals\"  ( \"Date\", \"Project\", \"FiscalYear\", \"IndNum\", \"IndName\", \"CorpCode\", \"CurrentValue\", \"NoteTitle\", \"NoteAnnotation\" )",
  " VALUES ( '", NoID[1,1], "','", paste(NoID[1,2:ncol(NoID)],collapse="','"),
  "')", sep="")

sqlQuery(channel, Q, errors = TRUE)

Finally!! OK, that works. Now do it for the whole DF, but pasting together more than 2 character vectors pairwise is tricky... so:

## first create a character vector for each row, with the quotation marks nicely blended.
crazyD <- ""
for(i in 1:ncol(NoID)){
  crazyD <- paste(crazyD,paste("'",NoID[,i],"'", sep=""),sep="")
} 
crazyD <- gsub("''","','",crazyD)

## And now combine that one with the titles
Q <- paste(
  "INSERT INTO \"Actuals\"  ( \"Date\", \"Project\", \"FiscalYear\", \"IndNum\", \"IndName\", \"CorpCode\", \"CurrentValue\", \"NoteTitle\", \"NoteAnnotation\" ) VALUES ( ",
  crazyD, ")", collapse="; ")

## And push that query into the server
sqlQuery(channel, Q, errors = TRUE)
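The same idea can be written so that NA becomes NULL, numbers stay unquoted, and single quotes inside the data are doubled instead of relying on the `gsub("''", "','", ...)` trick. This is a hedged base-R sketch, not a tested replacement for sqlSave: `sql_literal` and `build_inserts` are hypothetical helpers, the demo data is made up, and the batch size of 1000 reflects SQL Server's documented limit of 1000 rows per VALUES list.

```r
# Hypothetical helper: render one column as SQL literals.
sql_literal <- function(x) {
  if (is.numeric(x)) {
    out <- as.character(x)                   # numbers go in unquoted
  } else {
    txt <- gsub("'", "''", as.character(x), fixed = TRUE)  # escape quotes
    out <- paste0("'", txt, "'")
  }
  out[is.na(x)] <- "NULL"                    # NA -> SQL NULL, not 'NA'
  out
}

# Hypothetical helper: one multi-row INSERT statement per batch of rows.
build_inserts <- function(df, table, batch_size = 1000) {
  if (nrow(df) == 0) return(character(0))
  cols <- paste0('"', names(df), '"', collapse = ", ")
  lits <- unname(lapply(df, sql_literal))
  rows <- paste0("(", do.call(paste, c(lits, sep = ", ")), ")")
  batches <- split(rows, ceiling(seq_along(rows) / batch_size))
  vapply(batches, function(b) {
    paste0('INSERT INTO "', table, '" (', cols, ") VALUES ",
           paste(b, collapse = ", "), ";")
  }, character(1), USE.NAMES = FALSE)
}

# Made-up demo rows; the ID column is simply left out of the data frame.
demo <- data.frame(Date = as.Date(c("2016-03-23", "2016-03-24")),
                   Project = c("x", "O'Brien"),
                   CurrentValue = c(4883, NA),
                   stringsAsFactors = FALSE)
stmts <- build_inserts(demo, "Actuals")
print(stmts)

# With a live connection (assumes `channel` from odbcConnect above):
# for (s in stmts) sqlQuery(channel, s, errors = TRUE)
```

Sending one statement per batch keeps each query bounded in size regardless of how many rows the data frame has. String escaping by hand is still weaker than bound parameters, so this is only reasonable for trusted data.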

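So that's how I do it. Until someone shows me a better way, I guess. Until then, the thing keeping me on my toes is: Maximum size for a SQL Server query? IN clause? Is there a better approach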
