Is there a better way to perform element-wise division in Python? The code below is supposed to divide row A1 by row B1 and row A2 by row B2, so I expect only two rows of output. Instead, it divides A1 by B1, A1 by B2, A2 by B1, and A2 by B2. Can anyone help me?

The binary file encodes A, C, G, T as 1000, 0100, 0010, 0001. The division file has four columns, one each for A, C, G, T, so each value obtained earlier must be divided by the value in the matching column.

Code

import numpy as np
from numpy import genfromtxt
import csv
csvfile = open('output.csv', 'wb')
writer = csv.writer(csvfile)

#open csv file into arrays
with open('binary.csv') as actg:
    actg=actg.readlines()
    with open('single.csv') as single:
        single=single.readlines()
        with open('division.csv') as division:
            division=division.readlines()

            # Converting binary line and single line into 3 rows and 4 columns 
            # binary values using reshape
            for line in actg:
                myarray = np.fromstring(line, dtype=float, sep=',')                
                myarray = myarray.reshape((-1, 3, 4))
                for line2 in single:                    
                    single1 = np.fromstring(line2, dtype=float, sep=',')
                    single1 = single1.reshape((-1, 4))
                    # This division is in 2 rows and 4 column: first column 
                    # represents 1000, 2nd-0100, 3rd-0010, 4th-0001 in the
                    # binary.csv. Therefore the division part where 1000's
                    # value should be divided by 1st column, 0010 should be
                    # divided by 3rd column value
                    for line1 in division:
                        division1 = np.fromstring(line1, dtype=float, sep=',')
                        m=np.asmatrix(division1)
                        m=np.array(m)
                        res2 = (single1[np.newaxis,:,:] / m[:,np.newaxis,:] * myarray).sum(axis=-1)                        
                        print(res2)
                        writer.writerow(res2)


csvfile.close()

binary.csv

0,1,0,0,1,0,0,0,0,0,0,1
0,0,1,0,1,0,0,0,1,0,0,0

single.csv:

0.28,0.22,0.23,0.27,0.12,0.29,0.34,0.21,0.44,0.56,0.51,0.65

division.csv

0.4,0.5,0.7,0.1
0.2,0.8,0.9,0.3

Expected output

 0.44,0.3,6.5
 0.26,0.6,2.2

Actual output

0.44,0.3,6.5
0.275,0.6,2.16666667
0.32857143,0.3,1.1       
0.25555556,0.6,2.2       

Explanation on the error

Let the division file be as follows:

A,B,C,D
E,F,G,H

Let the result of the single and binary computation be as follows:

1,3,4
2,2,1

Let the numbers 1, 2, 3, 4 correspond to the column positions A, B, C, D in the first row and E, F, G, H in the second row. The division should then be:

1/A,3/C,4/D
2/F,2/F,1/E

where 1 is divided by A, 3 is divided by C, and so on. This is what the code is supposed to do. Unfortunately, the division happens as described earlier: the row 2,2,1 is also divided by B,B,C and the row 1,3,4 is also divided by E,G,H, so the output has 4 rows, which is not what I want.

2 Answers

I don't know if this is what you are looking for, but here is a short way to get what (I think) you want:

import numpy as np

binary = np.genfromtxt('binary.csv', delimiter = ',').reshape((2, 3, 4))
single = np.genfromtxt('single.csv', delimiter = ',').reshape((1, 3, 4))
divisi = np.genfromtxt('division.csv', delimiter = ',').reshape((2, 1, 4))

print(np.sum(single / divisi * binary, axis = -1))

Output:

[[ 0.44        0.3         6.5       ]
 [ 0.25555556  0.6         2.2       ]]
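
To make the shapes explicit, here is a minimal sketch of the same computation with the sample data typed in directly instead of read from the CSV files. The intermediate names `ratio` and `result` are only illustrative. Dividing a `(1, 3, 4)` array by a `(2, 1, 4)` array broadcasts to `(2, 3, 4)`, so row i of `division.csv` is paired only with row i of `binary.csv`:

```python
import numpy as np

# Sample data from the question, inlined so the sketch runs without the CSV files.
binary = np.array([[0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1],
                   [0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]], dtype=float).reshape((2, 3, 4))
single = np.array([0.28, 0.22, 0.23, 0.27,
                   0.12, 0.29, 0.34, 0.21,
                   0.44, 0.56, 0.51, 0.65]).reshape((1, 3, 4))
divisi = np.array([[0.4, 0.5, 0.7, 0.1],
                   [0.2, 0.8, 0.9, 0.3]]).reshape((2, 1, 4))

# (1, 3, 4) / (2, 1, 4) broadcasts to (2, 3, 4): every group of four values
# in single is divided column by column by each division row.
ratio = single / divisi

# Multiplying by the one-hot rows keeps one quotient per group of four;
# summing over the last axis collapses each group to that value.
result = (ratio * binary).sum(axis=-1)

print(ratio.shape)   # (2, 3, 4)
print(result.shape)  # (2, 3)
```

Because the first axis of `binary` and `divisi` has the same length, no cross combinations (A1 with B2, A2 with B1) are produced.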
answered 2016-02-24T10:36:09.467

With the intermediate values printed, the output of your program looks roughly like this:

myarray
[ 0.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.  1.]

[[[ 0.  1.  0.  0.]
  [ 1.  0.  0.  0.]
  [ 0.  0.  0.  1.]]]

single1
[ 0.28  0.22  0.23  0.27  0.12  0.29  0.34  0.21  0.44  0.56  0.51  0.65]

[[ 0.28  0.22  0.23  0.27]
 [ 0.12  0.29  0.34  0.21]
 [ 0.44  0.56  0.51  0.65]]

    division
    [ 0.4  0.5  0.7  0.1]    
    m
    [[ 0.4  0.5  0.7  0.1]]    
    res2
    [[ 0.44  0.3   6.5 ]]

    division
    [ 0.2  0.8  0.9  0.3]    
    m
    [[ 0.2  0.8  0.9  0.3]]        
    res2
    [[ 0.275       0.6         2.16666667]]

myarray
[ 0.  0.  1.  0.  1.  0.  0.  0.  1.  0.  0.  0.]

[[[ 0.  0.  1.  0.]
  [ 1.  0.  0.  0.]
  [ 1.  0.  0.  0.]]]


single1
[ 0.28  0.22  0.23  0.27  0.12  0.29  0.34  0.21  0.44  0.56  0.51  0.65]

[[ 0.28  0.22  0.23  0.27]
 [ 0.12  0.29  0.34  0.21]
 [ 0.44  0.56  0.51  0.65]]

    division
    [ 0.4  0.5  0.7  0.1]
    m
    [[ 0.4  0.5  0.7  0.1]]
    res2
    [[ 0.32857143  0.3         1.1       ]]

    division
    [ 0.2  0.8  0.9  0.3]
    m
    [[ 0.2  0.8  0.9  0.3]]
    res2
    [[ 0.25555556  0.6         2.2       ]]

So, with that in mind, it looks like the last two lines of your output are caused by the second line in binary.csv, which is combined with both lines of division.csv. If you don't want 4 lines in your result, don't use that line with every division row in your calculations.
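
If the intent is to pair line i of binary.csv with line i of division.csv, one possible alternative to the nested loops is to iterate over both with `zip`. The following is only a sketch under that assumption, with the sample data inlined as strings; the helper `parse` and the variable names are hypothetical and not part of the original code:

```python
import numpy as np

# Sample lines from the question, inlined instead of read from the CSV files.
binary_lines = ["0,1,0,0,1,0,0,0,0,0,0,1",
                "0,0,1,0,1,0,0,0,1,0,0,0"]
single_line = "0.28,0.22,0.23,0.27,0.12,0.29,0.34,0.21,0.44,0.56,0.51,0.65"
division_lines = ["0.4,0.5,0.7,0.1",
                  "0.2,0.8,0.9,0.3"]


def parse(line):
    """Hypothetical helper: turn one comma-separated line into a float array."""
    return np.array(line.strip().split(","), dtype=float)


single = parse(single_line).reshape((3, 4))

rows = []
# zip pairs the first binary line with the first division line, the second
# with the second, and so on, so no cross combinations are computed.
for b_line, d_line in zip(binary_lines, division_lines):
    onehot = parse(b_line).reshape((3, 4))
    divisor = parse(d_line)  # shape (4,), broadcast across the 3 rows
    rows.append((single / divisor * onehot).sum(axis=-1))

result = np.vstack(rows)
print(result)
```

With the sample data this yields two rows, one per paired line, rather than four.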

answered 2016-02-24T10:40:32.540