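How can I dynamically populate a database from an HTML table (for example, S&P 500 market data)?
I have a Yahoo! Finance account. In the account I can view financial data in HTML format.
I need a simple tool to populate a database (Access) from an HTML table. Where can I find such a tool?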
You can export the Yahoo historical data to CSV and link the CSV file directly into Access as an MS Access table. http://office.microsoft.com/en-ca/access-help/import-or-link-to-data-in-a-text-file-HA001232227.aspx
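The linking step can also be scripted instead of done through the import wizard. The following is a minimal sketch, assuming the CSV has a header row; the file path and the linked table name are hypothetical placeholders, not values from the question.

```vba
Sub LinkYahooCsv()
    ' Hypothetical names: adjust the path and the linked table name to your setup
    Const CsvPath As String = "C:\SomeFolder\table.csv"
    Const LinkedTableName As String = "YahooHistory"

    ' acLinkDelim creates a linked table that reads the delimited text file in place,
    ' so replacing the CSV file is reflected the next time the table is opened
    DoCmd.TransferText TransferType:=acLinkDelim, _
                       TableName:=LinkedTableName, _
                       FileName:=CsvPath, _
                       HasFieldNames:=True
End Sub
```

Run it once from the Immediate window with LinkYahooCsv; afterwards the linked table can be queried like any other Access table.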
If you would rather process the HTML page source, this link may help.
http://www.access-programmers.co.uk/forums/showthread.php?p=1145646
ACE/Jet OLEDB can be used to import data directly from an HTML file. For example, given an existing Access table [DataFromHtml]
ID LastName
-- --------
and an HTML file containing a table
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
<title>
Test Data
</title>
</head>
<body>
<table>
<tr>
<th>
ID
</th>
<th>
LastName
</th>
</tr>
<tr>
<td>
1
</td>
<td>
Thompson
</td>
</tr>
<tr>
<td>
2
</td>
<td>
O'Rourke
</td>
</tr>
</table>
</body>
</html>
the following VBA code empties the Access table (with DELETE FROM) and then imports the HTML table data into it.
Sub ImportFromHtml()
    Const LocalTableName = "DataFromHtml"
    Dim con As Object, rstHtml As Object, fld As Object, _
        cdb As DAO.Database, rstAccdb As DAO.Recordset, _
        recCount As Long

    ' Open the HTML file through the ACE "HTML Import" driver
    Set con = CreateObject("ADODB.Connection")
    con.Open _
        "Provider=Microsoft.ACE.OLEDB.12.0;" & _
        "Data Source=C:\Users\Gord\Documents\table.htm;" & _
        "Extended Properties=""HTML Import;HDR=YES;IMEX=1"";"

    ' The HTML table is addressed here by the page title, "Test Data"
    Set rstHtml = CreateObject("ADODB.Recordset")
    rstHtml.Open "SELECT * FROM [Test Data]", con

    ' Empty the local table before importing
    Set cdb = CurrentDb
    cdb.Execute "DELETE FROM [" & LocalTableName & "]", dbFailOnError
    Set rstAccdb = cdb.OpenRecordset(LocalTableName, dbOpenTable)

    ' Copy each HTML row into the Access table, matching fields by name
    recCount = 0
    Do While Not rstHtml.EOF
        recCount = recCount + 1
        rstAccdb.AddNew
        For Each fld In rstHtml.Fields
            rstAccdb.Fields(Trim(fld.Name)).Value = Trim(fld.Value)
        Next
        Set fld = Nothing
        rstAccdb.Update
        rstHtml.MoveNext
    Loop

    ' Clean up
    rstAccdb.Close
    Set rstAccdb = Nothing
    Set cdb = Nothing
    rstHtml.Close
    Set rstHtml = Nothing
    con.Close
    Set con = Nothing
    Debug.Print recCount & " record(s) imported"
End Sub
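As a shorter alternative to the recordset loop, Access's built-in DoCmd.TransferText can read an HTML table directly. This is a different technique from the ADO code above, shown only as a sketch: it assumes the same file and table as the example, and that the table can be selected by the page title "Test Data". If HTMLTableName is left blank, Access takes the first table or list in the file.

```vba
Sub ImportHtmlWithTransferText()
    ' Assumes the target table and HTML file from the example above
    Const LocalTableName As String = "DataFromHtml"
    Const HtmlPath As String = "C:\Users\Gord\Documents\table.htm"

    ' Empty the target table first so repeated runs do not accumulate rows
    CurrentDb.Execute "DELETE FROM [" & LocalTableName & "]", dbFailOnError

    ' acImportHTML imports an HTML table; HasFieldNames:=True treats the
    ' first row (the <th> cells) as column names
    DoCmd.TransferText TransferType:=acImportHTML, _
                       TableName:=LocalTableName, _
                       FileName:=HtmlPath, _
                       HasFieldNames:=True, _
                       HTMLTableName:="Test Data"
End Sub
```

The explicit loop in ImportFromHtml remains the better fit when values need trimming or other per-field handling during the import.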
Assuming the HTML structure from Gord Thompson's solution, there is a very fast ADO approach.
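Public Function GetTitle(ByVal HtmlFile As String) As String
    ' MSXML is an XML parser, so this only works if the HTML file is also well-formed XML
    Dim DOM As Object
    Set DOM = CreateObject("MSXML2.DOMDocument")
    DOM.async = False   ' load synchronously so the document is ready when Load returns
    If Not DOM.Load(HtmlFile) Then
        Err.Raise vbObjectError + 1, "GetTitle", _
            "Could not parse " & HtmlFile & ": " & DOM.parseError.reason
    End If
    GetTitle = Trim$(DOM.getElementsByTagName("title")(0).Text)
End Function
Public Sub Import(ByVal Filename As String, ByVal Tablename As String)
    Dim SQL As String
    Dim Title As String
    On Error GoTo Import_Error
    Title = GetTitle(Filename)
    ' Drop any previous copy of the table; ignore the error raised if it does not exist yet
    On Error Resume Next
    CurrentProject.Connection.Execute "DROP TABLE [" & Tablename & "]"
    On Error GoTo Import_Error
    ' SELECT ... INTO creates the table and copies all rows in a single statement
    SQL = "SELECT * INTO [" & Tablename & "]" & _
          " FROM [HTML Import;HDR=YES;IMEX=1;DATABASE=" & Filename & "].[" & Title & "]"
    CurrentProject.Connection.Execute SQL
    Exit Sub
Import_Error:
    ' Report the failure instead of exiting silently
    Debug.Print "Import failed: " & Err.Description
End Sub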
So, to load the HTML file "C:\SomeFolder\MyFile.html" into the table "MyImport", use:
Import "C:\SomeFolder\MyFile.html", "MyImport"
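One more tip: if the title of the HTML file contains special characters such as . or :, the import will fail. You will have to find out by trial which special characters cause problems and which do not.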