Right now, I can get nearly what I want with a simple re.split("\n([^\s])", data),
as shown below. The problem is that the resulting list contains the single non-whitespace character matched by the group as its own item. Example output is below the script. Notice how the "V" in "VLAN" has been split off as its own item?
I'm also wondering if there's just a better way to do this, perhaps a library I can import that handles converting tabular data into a dictionary or something.
#!/usr/bin/python
import re
import sys
data = """
VLAN Name Status Ports
---- -------------------------------- --------- -------------------------------
1 default active Fa0/2, Fa0/3, Fa0/4, Fa0/5, Fa0/6, Fa0/7
                 Fa0/8, Fa0/9, Fa0/10, Fa0/11, Fa0/12
                 Fa0/13, Fa0/14, Fa0/15, Fa0/16, Fa0/17
                 Fa0/18, Fa0/19, Fa0/20, Fa0/21, Fa0/22
                 Fa0/23, Fa0/24, Gi0/2
1002 fddi-default act/unsup
1003 token-ring-default act/unsup
1004 fddinet-default act/unsup
1005 trnet-default act/unsup
"""
lines = re.split("\n([^\s])", data)
print lines
Output:
['', 'V', 'LAN Name Status Ports', '-', '--- -------------------------------- --------- -------------------------------', '1', ' default active Fa0/2, Fa0/3, Fa0/4, Fa0/5, Fa0/6, Fa0/7\n
Fa0/8, Fa0/9, Fa0/10, Fa0/11, Fa0/12\n
Fa0/13, Fa0/14, Fa0/15, Fa0/16, Fa0/17\n
Fa0/18, Fa0/19, Fa0/20, Fa0/21, Fa0/22\n
Fa0/23, Fa0/24, Gi0/2', '1', '002 fddi-default act/unsup', '1', '003 token-ring-default act/unsup', '1', '004 fddinet-default act/unsup', '1', '005 trnet-default act/unsup\n']
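The stray single-character items come from the capturing group: re.split includes the text of every captured group in its result. One possible way around that is a lookahead, which checks for a non-whitespace character after the newline without consuming it. The following is only a minimal sketch, assuming continuation lines always begin with whitespace; the `sample` string is a shortened, hypothetical stand-in for the real switch output.

```python
import re

# Hypothetical, shortened sample. Continuation lines are assumed to
# start with whitespace, as in the switch output.
sample = (
    "VLAN Name Status Ports\n"
    "1 default active Fa0/2, Fa0/3\n"
    "                 Fa0/8, Fa0/9\n"
    "1002 fddi-default act/unsup\n"
)

# (?=\S) is a zero-width lookahead: the newline must be followed by a
# non-whitespace character, but that character is not consumed or captured.
records = re.split(r"\n(?=\S)", sample.strip("\n"))

for record in records:
    print(repr(record))
```

Because the lookahead is zero-width, only the newline itself is removed at each split point, so the "V" stays attached to "VLAN" and the indented port lines stay inside the record for VLAN 1.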
Thanks!
Edit: lines = re.findall(".*[^\n\W]*", data) seems like it's probably a better approach (never mind, that doesn't work, sorry), but this whole thing still feels pretty hacky, so I'd love to hear any alternative suggestions.
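As a rough illustration of the "tabular data into a dictionary" idea, here is a sketch that uses only the standard library. The function name `parse_vlan_table` and the `sample` text are hypothetical, and the sketch assumes that continuation lines start with whitespace, that the first three columns never contain spaces, and that port names contain no spaces.

```python
import re

def parse_vlan_table(text):
    """Return a list of dicts, one per VLAN row (hypothetical helper)."""
    # Split only at newlines followed by a non-space character, so that
    # indented continuation lines stay attached to their row.
    records = re.split(r"\n(?=\S)", text.strip("\n"))
    header = records[0].split()
    rows = []
    for record in records[1:]:
        if record.startswith("-"):
            continue  # skip the dashed separator line
        # Assumption: every column except the last is a single token.
        fields = record.split(None, len(header) - 1)
        row = dict(zip(header, fields))
        # Ports may be missing or span several lines; normalise to a list.
        ports = row.get("Ports", "")
        row["Ports"] = [p for p in re.split(r"[,\s]+", ports) if p]
        rows.append(row)
    return rows

# Hypothetical, shortened sample in the same layout as the switch output.
sample = """
VLAN Name Status Ports
---- ---- ------ -----
1 default active Fa0/2, Fa0/3
                 Fa0/8, Gi0/2
1002 fddi-default act/unsup
"""

rows = parse_vlan_table(sample)
for row in rows:
    print(row)
```

Each row becomes a dict keyed by the header names, with "Ports" always a list (empty when the row has no ports). If a name or status could contain spaces, slicing by the column widths of the dashed line would probably be a safer choice than whitespace splitting.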