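I have a complete table of stock price data. Each row has a unique combination of ticker symbol and date. I load new data by fetching a CSV file that contains the daily price data for each stock. I know there are duplicates in the CSV file. I only want to add the data that is not already in my table. What is the fastest way to do this?
Should I try to add every row and catch each exception? Or should I read my data table and compare each CSV row against it to see whether the row already exists? Or is there another option?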
Additional information
This is what I have been doing. For each row in the CSV file, I query my data table to see whether that row already exists.
Dim strURL As String
Dim strBuffer As String
strURL = "http://ichart.yahoo.com/table.csv?s=" & tickerValue
strBuffer = RequestWebData(strURL)
Dim sReader As New StringReader(strBuffer)
Dim List As New List(Of String)
Do While sReader.Peek >= 0
    List.Add(sReader.ReadLine)
Loop
' Drop the CSV header row
List.RemoveAt(0)
Dim lines As String() = List.ToArray
sReader.Close()
For Each line In lines
    Dim checkDate = line.Split(",")(0).Trim()
    ' Look up this ticker/date pair before inserting
    Dim cmd2 As New OleDb.OleDbCommand("SELECT * FROM " & tblName & " WHERE Ticker = ? AND [Date] = ?", con)
    cmd2.Parameters.AddWithValue("?", tickerValue)
    cmd2.Parameters.AddWithValue("?", checkDate)
    Dim dr As OleDbDataReader = cmd2.ExecuteReader
    Dim rowExists As Boolean = dr.Read()
    dr.Close()
    If Not rowExists Then
        Dim cmd3 As OleDbCommand = New OleDbCommand("INSERT INTO " & tblName & " (Ticker, [Date], [Open], High, Low, [Close], Volume, Adj_Close) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", con)
        cmd3.Parameters.Add("@Ticker", OleDbType.VarChar).Value = tickerValue
        cmd3.Parameters.Add("@[Date]", OleDbType.VarChar).Value = checkDate
        cmd3.Parameters.Add("@[Open]", OleDbType.VarChar).Value = line.Split(",")(1).Trim
        cmd3.Parameters.Add("@High", OleDbType.VarChar).Value = line.Split(",")(2).Trim
        cmd3.Parameters.Add("@Low", OleDbType.VarChar).Value = line.Split(",")(3).Trim
        cmd3.Parameters.Add("@[Close]", OleDbType.VarChar).Value = line.Split(",")(4).Trim
        cmd3.Parameters.Add("@Volume", OleDbType.VarChar).Value = line.Split(",")(5).Trim
        cmd3.Parameters.Add("@Adj_Close", OleDbType.VarChar).Value = line.Split(",")(6).Trim
        cmd3.ExecuteNonQuery()
    End If
Next
This is what I have switched to, and it throws this exception: The changes you requested to the table were not successful because they would create duplicate values in the index, primary key, or relationship. Change the data in the field or fields that contain duplicate data, remove the index, or redefine the index to permit duplicate entries and try again.
I could catch this exception each time and ignore it until I reach a new row.
Dim strURL As String = "http://ichart.yahoo.com/table.csv?s=" & tickerValue
Debug.WriteLine(strURL)
Dim strBuffer As String = RequestWebData(strURL)
Using streamReader = New StringReader(strBuffer)
    Using reader = New CsvReader(streamReader)
        reader.ReadHeaderRecord()
        While reader.HasMoreRecords
            Dim dataRecord As DataRecord = reader.ReadDataRecord()
            Dim cmd3 As OleDbCommand = New OleDbCommand("INSERT INTO " & tblName & " (Ticker, [Date], [Open], High, Low, [Close], Volume, Adj_Close) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", con)
            cmd3.Parameters.Add("@Ticker", OleDbType.VarChar).Value = tickerValue
            cmd3.Parameters.Add("@[Date]", OleDbType.VarChar).Value = dataRecord.Item("Date")
            cmd3.Parameters.Add("@[Open]", OleDbType.VarChar).Value = dataRecord.Item("Open")
            cmd3.Parameters.Add("@High", OleDbType.VarChar).Value = dataRecord.Item("High")
            cmd3.Parameters.Add("@Low", OleDbType.VarChar).Value = dataRecord.Item("Low")
            cmd3.Parameters.Add("@[Close]", OleDbType.VarChar).Value = dataRecord.Item("Close")
            cmd3.Parameters.Add("@Volume", OleDbType.VarChar).Value = dataRecord.Item("Volume")
            cmd3.Parameters.Add("@Adj_Close", OleDbType.VarChar).Value = dataRecord.Item("Adj Close")
            cmd3.ExecuteNonQuery()
        End While
    End Using
End Using
I just want to use the most efficient method.
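A third option I can think of is to read the dates that already exist for a ticker once, keep them in a HashSet, and filter the CSV lines in memory before running any INSERT. The sketch below shows only the filtering step. FilterNewLines, existingDates and the sample rows are hypothetical names and data made up for illustration, and the set is hard-coded instead of being loaded from the table.

```vbnet
Imports System
Imports System.Collections.Generic

Module DedupSketch
    ' Hypothetical helper: returns only the CSV lines whose date (first field)
    ' is not yet in existingDates. New dates are added to the set as they are
    ' seen, so duplicates inside the CSV itself are skipped as well.
    Function FilterNewLines(csvLines As IEnumerable(Of String),
                            existingDates As HashSet(Of String)) As List(Of String)
        Dim fresh As New List(Of String)
        For Each line As String In csvLines
            If String.IsNullOrWhiteSpace(line) Then Continue For
            Dim dateKey As String = line.Split(","c)(0).Trim()
            ' HashSet.Add returns False when the key is already present.
            If existingDates.Add(dateKey) Then
                fresh.Add(line)
            End If
        Next
        Return fresh
    End Function

    Sub Main()
        ' Assumption: in the real code this set would be filled once per ticker
        ' from a single SELECT of the [Date] column; here it is hard-coded.
        Dim existingDates As New HashSet(Of String) From {"2014-01-02", "2014-01-03"}
        Dim csvLines As String() = {
            "2014-01-03,10.0,10.5,9.8,10.2,1000,10.2",
            "2014-01-06,10.2,10.9,10.1,10.8,1200,10.8",
            "2014-01-06,10.2,10.9,10.1,10.8,1200,10.8"
        }
        ' Only the 2014-01-06 row is new, and it is printed once.
        For Each line As String In FilterNewLines(csvLines, existingDates)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

The keys are compared as plain strings, so the dates read from the table would have to be formatted exactly like the dates in the CSV for this to work.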
Update
Based on the answers below, this is the code I have so far:
Dim strURL As String = "http://ichart.yahoo.com/table.csv?s=" & tickerValue
Dim strBuffer As String = RequestWebData(strURL)
Using streamReader = New StringReader(strBuffer)
    Using reader = New CsvReader(streamReader)
        ' the CSV file has a header record, so we read that first
        reader.ReadHeaderRecord()
        While reader.HasMoreRecords
            Dim dataRecord As DataRecord = reader.ReadDataRecord()
            Dim cmd3 As OleDbCommand = New OleDbCommand("INSERT INTO " & tblName & "(Ticker, [Date], [Open], High, Low, [Close], Volume, Adj_Close) " & "SELECT ?, ?, ?, ?, ?, ?, ?, ? " & "FROM DUAL " & "WHERE NOT EXISTS (SELECT 1 FROM " & tblName & " WHERE Ticker = ? AND [Date] = ?)", con)
            cmd3.Parameters.Add("@Ticker", OleDbType.VarChar).Value = tickerValue
            cmd3.Parameters.Add("@[Date]", OleDbType.VarChar).Value = dataRecord.Item("Date")
            cmd3.Parameters.Add("@[Open]", OleDbType.VarChar).Value = dataRecord.Item("Open")
            cmd3.Parameters.Add("@High", OleDbType.VarChar).Value = dataRecord.Item("High")
            cmd3.Parameters.Add("@Low", OleDbType.VarChar).Value = dataRecord.Item("Low")
            cmd3.Parameters.Add("@[Close]", OleDbType.VarChar).Value = dataRecord.Item("Close")
            cmd3.Parameters.Add("@Volume", OleDbType.VarChar).Value = dataRecord.Item("Volume")
            cmd3.Parameters.Add("@Adj_Close", OleDbType.VarChar).Value = dataRecord.Item("Adj Close")
            cmd3.Parameters.Add("@Ticker", OleDbType.VarChar).Value = tickerValue
            cmd3.Parameters.Add("@[Date]", OleDbType.VarChar).Value = dataRecord.Item("Date")
            cmd3.ExecuteNonQuery()
        End While
    End Using
End Using
It gives me this error: Data type mismatch in criteria expression.
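One thing I have not ruled out is that every parameter is bound as OleDbType.VarChar, so the [Date] comparison in the WHERE clause receives text. I have not confirmed that this is the cause of the error. The sketch below only shows converting one CSV line into typed values before binding. It assumes the dates arrive as yyyy-MM-dd with "." as the decimal separator, and the sample line is made up.

```vbnet
Imports System
Imports System.Globalization

Module ParseSketch
    Sub Main()
        ' Made-up sample row in the column order
        ' Date, Open, High, Low, Close, Volume, Adj Close.
        Dim line As String = "2014-01-06,10.2,10.9,10.1,10.8,1200,10.8"
        Dim fields As String() = line.Split(","c)

        ' Assumption: dates are written as yyyy-MM-dd and numbers use "."
        ' as the decimal separator, so the invariant culture is used.
        Dim tradeDate As Date = Date.ParseExact(fields(0).Trim(), "yyyy-MM-dd", CultureInfo.InvariantCulture)
        Dim openPrice As Double = Double.Parse(fields(1).Trim(), CultureInfo.InvariantCulture)
        Dim volume As Long = Long.Parse(fields(5).Trim(), CultureInfo.InvariantCulture)

        ' Print the typed values back to show the conversion succeeded.
        Console.WriteLine(tradeDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))
        Console.WriteLine(openPrice.ToString(CultureInfo.InvariantCulture))
        Console.WriteLine(volume)
    End Sub
End Module
```

Values converted this way could then be bound with OleDbType.Date and OleDbType.Double instead of OleDbType.VarChar, provided the table columns actually have those types.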