I posted a question a week ago, and the answer was simply to use join:
join <(sort file1) <(sort file2) >output
This joins files that share a common field, which is usually the first field.
I have the following two files:
genes.txt
ENSG001 ENSG002
ENSG002 ENSG001
ENSG003 ENSG004
features.txt
ENSG001 400
ENSG002 350
ENSG003 210
ENSG004 100
I need to join these two files so that each gene ID in genes.txt is followed by its value from features.txt, like this:
output.txt
ENSG001 400 ENSG002 350
ENSG002 350 ENSG001 400
ENSG003 210 ENSG004 100
I know the answer involves the join command, but I can't figure out how to join on two fields. I tried:
join -j 1 <(sort genes.txt) <(sort features.txt) >attempt1.txt
but the result looks like this:
attempt1.txt
ENSG001 ENSG002 400
ENSG002 ENSG001 350
ENSG003 ENSG004 210
I then tried
join -j 2 <(sort -k 2 genes.txt) <(sort -k 2 features.txt) >attempt2.txt
but attempt2.txt is empty.
Can join combine two files based on two fields? If not, how can I do it?
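One possible approach, offered as a sketch rather than a definitive answer: join matches on a single field per file in each invocation, so a lookup table in awk may be simpler here. The example below recreates the two sample files from the question, loads features.txt into an array keyed by gene ID, and then prints each ID from genes.txt followed by its value. The array name `value` is just an illustrative choice, and the sketch assumes whitespace-separated fields and a non-empty features.txt.

```shell
# Recreate the two sample files from the question.
cat > genes.txt <<'EOF'
ENSG001 ENSG002
ENSG002 ENSG001
ENSG003 ENSG004
EOF

cat > features.txt <<'EOF'
ENSG001 400
ENSG002 350
ENSG003 210
ENSG004 100
EOF

# First file (NR == FNR): store each gene's value in a lookup table keyed by ID.
# Second file: print both IDs from each line, each followed by its looked-up value.
awk 'NR == FNR { value[$1] = $2; next }
     { print $1, value[$1], $2, value[$2] }' features.txt genes.txt > output.txt

cat output.txt
```

This keeps the line order of genes.txt and needs no sorting. An ID that does not appear in features.txt would be printed with an empty value, so that case may need extra handling.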