I need to implement a Python regular expression to search for all occurrences of A1a, A_1_a, A-1-a, _A_1_a_ or _A1a, where:
A can be A to Z.
1 can be 1 to 9.
a can be a to z.
There are always exactly three characters (letter, number, letter), separated by underscores, dashes or nothing. The case in the search string needs to be matched exactly.
The main problem I am having is that sometimes these three-character combinations are connected to other text by dashes and underscores. I am also having trouble creating a single regular expression that finds A1a, A-1-a and A_1_a.
Also I forgot to mention this is an XML file.
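For reference, the general form described above could be sketched as follows. This is only a minimal sketch: the names SEP, PATTERN and samples are made up for illustration, and it assumes the separators are limited to a single underscore or dash in each position. No re.I flag is passed, so case is matched exactly.

```python
import re

# Optional separator: one underscore or dash, or nothing at all.
SEP = r"[_-]?"

# Uppercase letter, digit 1-9, lowercase letter, with an optional
# separator before, between and after the three characters.
PATTERN = re.compile(SEP + r"[A-Z]" + SEP + r"[1-9]" + SEP + r"[a-z]" + SEP)

# Hypothetical sample strings; the last two should not match.
samples = ["A1a", "A_1_a", "A-1-a", "_A_1_a_", "_A1a", "a1A", "A0a"]
matched = [s for s in samples if PATTERN.search(s)]

# findall returns only the matched text, not the surrounding text.
embedded = PATTERN.findall("x_A_1_a_y B-2-c")
```

Because the leading and trailing separators are optional, the same pattern also picks up combinations that are glued to other text by a dash or underscore.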
Thanks, this found every occurrence of what I was looking for with a slight modification, [-]?[A][-]?[1][-]?[a][-]?, but I need these to be variables, something like
[-]?[var_A][-]?[var_3][-]?[Var_a][-]?
Would that be done like this:
regex = r"[-]?[%s][-]?[%s][-]?[%s][-]?"
print re.findall(regex,var_A,var_Num,Var_a)
Or more like:
regex = ''.join(['r','\"','[-]?[',Var_X,'][-]?[',Var_Num,'][-]?[',Var_x,'][-]?','\"'])
print regex
for sstr in searchstrs:
    matches = re.findall(regex, sstr, re.I)
But this isn't working
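One way this might be done is to build the pattern string first and only then hand it to re. The r prefix and the quotes are part of Python's literal syntax, not part of the pattern, so they should not be joined into the string, and re.findall takes a pattern and a single string to search rather than the substitution values. The sketch below assumes each variable holds exactly one character; build_pattern is a hypothetical helper name, and re.I is left out because the case has to match exactly.

```python
import re

def build_pattern(letter, digit, lower):
    """Compile letter/digit/lower with optional _ or - separators around them."""
    sep = r"[_-]?"
    # re.escape guards against a variable holding a regex metacharacter.
    parts = [re.escape(letter), re.escape(digit), re.escape(lower)]
    return re.compile(sep + sep.join(parts) + sep)

var_A, var_Num, var_a = "A", "3", "a"
regex = build_pattern(var_A, var_Num, var_a)

# Hypothetical test string; the final "a3A" has the wrong case and is skipped.
found = regex.findall('t:L="A_3_a Fdr2" t:L="A3a1 Status" a3A')
```

The same result could be had with r"[_-]?%s[_-]?%s[_-]?%s[_-]?" % (var_A, var_Num, var_a), as long as the formatting happens before the call to re.findall.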
Sample lines of the file before running the script:
<t:ION t:SA="BoolObj" t:H="2098947" t:P="2098944" t:N="AN7 Result" t:CI="Boolean_Register" t:L="A_3_a Fdr2" t:VS="true">
<t:ION t:SA="RegisterObj" t:H="20971785" t:P="20971776" t:N="ART1 Result 1" t:CI="NumericVariable_Register" t:L="A3a1 Status" t:VS="1">
<t:ION t:SA="ModuleObj" t:H="2100736" t:P="2097152" t:N="AND/OR 14" t:CI="AndOr_Module" t:L="A_3_a_2 Energized from Norm" t:S="0" t:SC="5">
After running the script, what I am getting (it is deleting the entire line and leaving only what is below):
- B_1_c
- B1c1
- B_1_c_2
What I want to get:
<t:ION t:SA="BoolObj" t:H="2098947" t:P="2098944" t:N="AN7 Result" t:CI="Boolean_Register" t:L="B_1_c Fdr2" t:VS="true">
<t:ION t:SA="RegisterObj" t:H="20971785" t:P="20971776" t:N="ART1 Result 1" t:CI="NumericVariable_Register" t:L="B1c1 Status" t:VS="1">
<t:ION t:SA="ModuleObj" t:H="2100736" t:P="2097152" t:N="AND/OR 14" t:CI="AndOr_Module" t:L="B_1_c_2 Energized from Norm" t:S="0" t:SC="5">
import re
import os

search_file_name = 'Alarms Test.fwn'
pattern = 'A3a'
fileName, fileExtension = os.path.splitext(search_file_name)
newfilename = fileName + '_' + pattern + fileExtension
outfile = open(newfilename, 'wb')

def find_ext(text):
    matches = re.findall(r'([_-]?[A{1}][_-]?[3{1}][_-]?[a{1}][_-]?)', text)
    records = [m.replace('3', '1').replace('A', 'B').replace('a', 'c') for m in matches]
    if matches:
        outfile.writelines(records)
        return 1
    else:
        outfile.writelines(text)
        return 0

def main():
    success = 0
    count = 0
    with open(search_file_name, 'rb') as searchfile:
        try:
            searchstrs = searchfile.readlines()
            for s in searchstrs:
                success = find_ext(s)
                count = count + success
        finally:
            searchfile.close()
    print count

if __name__ == "__main__":
    main()
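The script above writes only the list returned by re.findall, which holds just the matched text, so the rest of each line is lost. Note also that [A{1}] is a character class that matches the literal characters {, 1 and } as well as A. A possible alternative is to substitute inside the line with re.subn, which returns the whole rewritten line plus a match count. The following is a minimal sketch, not a drop-in fix: make_replacer and rewrite are hypothetical names, it assumes three-character old and new codes, and it works on text strings, so the files would need to be opened in text mode.

```python
import re

def make_replacer(old, new):
    """Build a function that swaps old -> new in a line, keeping separators."""
    sep = r"[_-]?"
    # Each optional separator is captured so the replacement can put it back.
    regex = re.compile("(%s)%s(%s)%s(%s)%s(%s)" % (
        sep, re.escape(old[0]), sep, re.escape(old[1]),
        sep, re.escape(old[2]), sep))

    def repl(m):
        # Reassemble: separator, new char, separator, new char, ...
        return (m.group(1) + new[0] + m.group(2) + new[1] +
                m.group(3) + new[2] + m.group(4))

    def rewrite(line):
        # Returns (rewritten_line, number_of_replacements).
        return regex.subn(repl, line)

    return rewrite

rewrite = make_replacer("A3a", "B1c")
new_line, count = rewrite('t:N="AN7 Result" t:L="A_3_a Fdr2" t:VS="true">')
```

Here new_line keeps every attribute and only the A_3_a part changes; writing new_line to the output file for every input line, matched or not, should give the full rewritten file. The pattern is applied to the whole line, so it would also rewrite a match found in an attribute other than t:L.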