I am parsing XML records with Ruby. The XML file has the following data structure:
<row Id="27" PostTypeId="2" ParentId="11" CreationDate="2008-08-01T12:17:19.357" Score="13" Body="<p>@jeff</p>

<p>IMHO yours seems a little long. However it does seem a little more robust with support for "yesterday" and "years". But in my experience when this is used the person is most likely to view the content in the first 30 days. It is only the really hardcore people that come after that. So that is why I usually elect to keep this short and simple.</p>

<p>This is the method I am currently using on one of my websites. This only returns a relative day, hour, time. And then the user has to slap on "ago" in the output.</p>

<pre><code>public static string ToLongString(this TimeSpan time)<br>{<br> string output = String.Empty;<br><br> if (time.Days &gt; 0)<br> output += time.Days + " days ";<br><br> if ((time.Days == 0 || time.Days == 1) &amp;&amp; time.Hours &gt; 0)<br> output += time.Hours + " hr ";<br><br> if (time.Days == 0 &amp;&amp; time.Minutes &gt; 0)<br> output += time.Minutes + " min ";<br><br> if (output.Length == 0)<br> output += time.Seconds + " sec";<br><br> return output.Trim();<br>}<br>
</code></pre>" OwnerUserId="17" LastEditorUserId="17" LastEditorDisplayName="Nick Berardi" LastEditDate="2008-08-01T13:16:49.127" LastActivityDate="2008-08-01T13:16:49.127" CommentCount="1" CommunityOwnedDate="2009-09-04T13:15:59.820" />
But some records do not contain all of the attributes:
<row Id="29" PostTypeId="2" ParentId="13" CreationDate="2008-08-01T12:19:17.417" Score="18" Body="<p>There are no HTTP headers that will report the clients timezone so far although it has been suggested to include it in the HTTP specification.</p>

<p>If it was me, I would probably try to fetch the timezone using clientside JavaScript and then submit it to the server using Ajax or something.</p>" OwnerUserId="19" LastActivityDate="2008-08-01T12:19:17.417" CommentCount="0" />
My Ruby parser iterates over these XML records and inserts them into a MySQL database:
def on_start_element(element, attributes)
  if element == 'row'
    @post_st.execute(attributes['Id'], attributes['PostTypeId'], attributes['AcceptedAnswerId'], attributes['ParentId'], attributes['Score'], attributes['ViewCount'],
      attributes['Body'], attributes['OwnerUserId'] == nil ? -1 : attributes['OwnerUserId'], attributes['LastEditorUserId'], attributes['LastEditorDisplayName'],
      DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
      attributes['AnswerCount'] == nil ? 0 : attributes['AnswerCount'], attributes['CommentCount'] == nil ? 0 : attributes['CommentCount'],
      attributes['FavoriteCount'] == nil ? 0 : attributes['FavoriteCount'], DateTime.parse(attributes['CreationDate']).to_time.strftime("%F %T"))
    post_id = attributes['Id']
    tags = attributes['Tags'] == nil ? '' : attributes['Tags']
    tags.scan(/<(.*?)>/).each do |tag_name|
      tag_id = insert_or_find_tag(tag_name[0])
      @post_ot_tag_insert_st.execute(post_id, tag_id)
    end
  end
end
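But while processing the second record (judging by what was inserted into my database, the last record inserted is the row with Id=27), I get the following error: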
/format.rb:1031:in `dup': can't dup NilClass (TypeError)
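I wonder whether this is related to the missing attributes. If a record is missing some attributes that I expect to store in the database, how should I handle that, or how can I fall back to some default value? For example, if a date is missing, set it to some default date value.
This is the line it complains about: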
DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
I think it is complaining about LastEditDate, since the second record does not have that attribute?
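Would something like the following be a reasonable way to handle it? This is only a minimal sketch of what I have in mind, not tested against my real parser. The helper name format_date and its default argument are my own invention, and I am assuming that passing nil to the prepared statement is acceptable for that column (that is, that the column allows NULL). I also dropped the to_time call here so the result does not depend on the local time zone.

```ruby
require 'date'

# Hypothetical helper (not part of my parser): formats an attribute value
# as "YYYY-MM-DD HH:MM:SS", or returns the given default when the
# attribute is missing (nil) or blank.
def format_date(value, default = nil)
  return default if value.nil? || value.strip.empty?
  DateTime.parse(value).strftime("%F %T")
end

# Attributes shaped like row 29, which has no LastEditDate.
attributes = {
  'CreationDate'     => '2008-08-01T12:19:17.417',
  'LastActivityDate' => '2008-08-01T12:19:17.417'
}

puts format_date(attributes['CreationDate']).inspect
puts format_date(attributes['LastEditDate']).inspect
puts format_date(attributes['LastEditDate'], '1970-01-01 00:00:00').inspect
```

In on_start_element I would then replace each DateTime.parse(attributes['...']).to_time.strftime("%F %T") with format_date(attributes['...']), optionally passing a default date string as the second argument.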