I'd like to get the name in column A from two approximate input values for columns B and C.

Data.csv

A;       B;        C
ALGOL;3.13614789;40.95564610
ALIOTH;12.90050072;55.95981118
ALKAID;13.79233003;49.31324779

The following code works for exact values:

fid = fopen('test.csv');
C = textscan(fid, '%s %s %s', 'Delimiter', ';');
fclose(fid);

val1 = input('Enter the first input: ', 's');
val2 = input('Enter the second input: ', 's');

if(find(ismember(C{2},val1)) == find(ismember(C{3},val2)))
    output = C{1}{find(ismember(C{2},val1))}
else
    disp('No match found!');
end

Result:

Enter the first input: 12.90050072
Enter the second input: 55.95981118

output =

ALIOTH

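But how can I get the same result with approximate values for val1 and val2? Example: val1 = 13.001 and val2 = 57.210 should give => "ALIOTH"

Maybe I have to use importdata and then check against a tolerance, but I don't know how. Is there a way to do this?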

2 Answers

Use floating-point numbers!

I suggest that you read the data as floating-point numbers rather than as strings:

C = textscan(fid, '%s %f %f', 'Delimiter', ';', 'HeaderLines', 1);

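This lets you perform numerical comparisons. You can then compute the distance (say, the Euclidean distance) between the search values and each row of the data matrix: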

v = [val1, val2];
dist = sqrt(sum(bsxfun(@minus, [C{2:3}], v) .^ 2, 2));
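
Note that the question reads val1 and val2 with input(..., 's'), so they are character arrays rather than numbers. The following is a minimal sketch, assuming the inputs are converted with str2double and that the three sample rows are typed inline instead of being read from the file (the names `names` and `coords` are illustrative, not from the original code):

```matlab
% Sample rows from Data.csv, typed inline for illustration only
names  = {'ALGOL'; 'ALIOTH'; 'ALKAID'};
coords = [ 3.13614789, 40.95564610;
          12.90050072, 55.95981118;
          13.79233003, 49.31324779];

% Inputs as they would arrive from input(..., 's'): character arrays
val1 = '13.001';
val2 = '57.210';

% Convert the text to numbers before building the query vector
v = [str2double(val1), str2double(val2)];

% Euclidean distance from the query to every data row (N-by-1 vector)
dist = sqrt(sum(bsxfun(@minus, coords, v) .^ 2, 2));
```

With these sample rows, the smallest entry of dist is the one for the second row (ALIOTH).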

Then you can pick the minimum value from dist (this always guarantees a match):

tf = (dist - min(dist) < eps);

Or pick the values that fall below a certain threshold:

tol = 2; %// Tolerance of your choice
tf = (dist < tol);

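The resulting logical (boolean) vector tf should have "1" at the positions of the matching rows.

You can convert it to the actual values of the first column by writing: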

result = C{1}(tf)
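
The two selection rules above (nearest row, or all rows under a threshold) can also be combined: take the nearest row, but accept it only if it lies within the tolerance. This combination is not part of the answer as written; the following is a sketch under that assumption, again with the sample rows typed inline and an arbitrarily chosen tol:

```matlab
% Sample rows from Data.csv, typed inline for illustration only
names  = {'ALGOL'; 'ALIOTH'; 'ALKAID'};
coords = [ 3.13614789, 40.95564610;
          12.90050072, 55.95981118;
          13.79233003, 49.31324779];

v   = [13.001, 57.210];   % approximate query from the question
tol = 2;                  % tolerance chosen arbitrarily for this sketch

% Euclidean distance from the query to every data row
dist = sqrt(sum(bsxfun(@minus, coords, v) .^ 2, 2));

% min returns both the smallest distance and the index of its row
[minDist, idx] = min(dist);

if minDist < tol
    result = names{idx};      % nearest row is close enough
else
    result = '';              % nearest row is still too far away
    disp('No match found!');
end
disp(result)
```

For the query from the question, the nearest row is ALIOTH at a distance of roughly 1.25, which is inside the tolerance of 2.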

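Generalization

This solution can be generalized to any number of columns P in the data. In addition, suppose you want to search the data for several different instances of v (assume that v is an M×P matrix, where each row of v is a different instance to match):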

vv = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, [C{2:end}], vv) .^ 2, 2)), [1 3 2]);
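
The permute/bsxfun expression can be hard to read at first. As a cross-check, here is a loop-based sketch that should produce the same N×M distance matrix, one column per row of v. It assumes the numeric columns are already collected in a matrix called `data` (a name introduced here for illustration) and uses the sample rows inline:

```matlab
% Numeric columns of the sample data (N-by-P), typed inline for illustration
data = [ 3.13614789, 40.95564610;
        12.90050072, 55.95981118;
        13.79233003, 49.31324779];

v = [13, 57.2; 13, 47];   % M-by-P matrix, one query per row

N = size(data, 1);
M = size(v, 1);

% Loop version: column m holds the distances from query m to every data row
dist = zeros(N, M);
for m = 1:M
    d = bsxfun(@minus, data, v(m, :));
    dist(:, m) = sqrt(sum(d .^ 2, 2));
end

% Vectorized version, for comparison
vv = permute(v, [3 2 1]);
distVec = permute(sqrt(sum(bsxfun(@minus, data, vv) .^ 2, 2)), [1 3 2]);
```

Both dist and distVec are N×M here (3×2 for the sample data), so each column corresponds to one row of v.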

Again, you can pick the minimum values, which guarantees a match:

tf = (abs(bsxfun(@minus, dist, min(dist))) < eps);

Or set a threshold:

tf = (dist < tol);

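tf is a logical N×M matrix (N is the total number of rows in the data), where each column marks the data rows that match the corresponding row of v.

To convert this to the values of the first column, you have to store the output in a cell array: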

result = arrayfun(@(x)C{1}(tf(:, x)), 1:size(tf, 2), 'UniformOutput', false);

Example

v = [13, 57.2; 13, 47]; %// Entries to search

vv = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, [C{2:end}], vv) .^ 2, 2)), [1 3 2])
tf = bsxfun(@minus, dist, min(dist)) < eps;

This results in:

tf =
     0     0
     1     0
     0     1

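This means that the first row of v matches the second data row, and the second row of v matches the third data row. To look up the matching values in the first data column, we do the following: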

result = arrayfun(@(x)C{1}(tf(:, x)), 1:size(tf, 2), 'UniformOutput', false);

This produces the following cell array:

result =
    { 'ALIOTH' }
    { 'ALKAID' }
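
If exactly one name per query is wanted, the index output of min can be used instead of the logical matrix. This is a variation on the approach above rather than part of it; the sketch assumes the sample rows inline, and the names `names`, `data`, and `nearest` are illustrative:

```matlab
% Sample rows from Data.csv, typed inline for illustration only
names = {'ALGOL'; 'ALIOTH'; 'ALKAID'};
data  = [ 3.13614789, 40.95564610;
         12.90050072, 55.95981118;
         13.79233003, 49.31324779];

v = [13, 57.2; 13, 47];   % entries to search, one per row

% N-by-M distance matrix, as in the example above
vv = permute(v, [3 2 1]);
dist = permute(sqrt(sum(bsxfun(@minus, data, vv) .^ 2, 2)), [1 3 2]);

% Row index of the nearest data row for each column of dist (each query)
[~, idx] = min(dist, [], 1);

% One name per query, as a flat M-by-1 cell array
nearest = names(idx(:));
```

When two data rows are equally close, min returns the first one, so ties are resolved silently in favor of the earlier row.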
answered 2013-06-02T13:53:27.927

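Assuming you are fine with a tolerance on the distance between the input pair of numbers and a target pair, here is one way to do it: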

function testApproximate
    % define tolerance
    tolerance = 1;
    % open file
    fid = fopen('Data.csv');
    % read headers and discard
    textscan(fid, '%s %s %s', 1, 'delimiter', ';');
    % read rest of the data, combine columns 2 and 3 into a single matrix
    C = textscan(fid, '%s %f %f', 'delimiter', ';', 'CollectOutput', 1);
    % close file
    fclose(fid);

    % ask user for values
    val1 = input('Enter the first input: ');
    val2 = input('Enter the second input: ');

    % use Euclidean distance to find the closest point within tolerance 
    x = isApproximatelyEqual(C{2}, [val1, val2], tolerance);
    if x > 0
        output = C{1}{x}
    else
        disp('No match found!');
    end
end

function x = isApproximatelyEqual(vectors, member, tol)
    % set default tolerance if it is not provided
    if nargin < 3, tol = Inf; end
    % v is the difference between all points in vectors and our single
    % point in member
    v = vectors - repmat(member, size(vectors,1), 1);
    % find the minimum value and index of square root of sum of square of
    % all difference vectors
    [mn, x] = min(sqrt(sum(v .^ 2, 2)));
    % if minimum value does not meet tolerance, reset x
    if mn > tol
        x = 0;
    end
    % return x
    return
end

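This method uses the Euclidean distance to find the nearest point. If you instead need to check each value separately to see whether it is within the tolerance, replace the isApproximatelyEqual function above with: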

function x = isApproximatelyEqual(vectors, member, tol)
    % set default tolerance if it is not provided
    if nargin < 3, tol = Inf; end
    % v is the difference between all points in vectors and our single
    % point in member
    v = vectors - repmat(member, size(vectors,1), 1);
    % return the first pair of points that matches the tolerance
    x = find(all(abs(v) < tol, 2), 1);
    return
end
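
The two criteria can disagree. The following sketch, using the sample rows inline and a query point chosen purely for illustration, shows a case where each coordinate is within the tolerance but the Euclidean distance is not:

```matlab
% Numeric columns of the sample data, typed inline for illustration only
vectors = [ 3.13614789, 40.95564610;
           12.90050072, 55.95981118;
           13.79233003, 49.31324779];

member = [13.7, 56.8];   % hypothetical query point
tol    = 1;

% Differences between every data row and the query
v = bsxfun(@minus, vectors, member);

% Criterion 1: Euclidean distance of the nearest row must be within tol
[mn, xEuclid] = min(sqrt(sum(v .^ 2, 2)));
if mn > tol
    xEuclid = 0;          % nearest row is about 1.16 away, so no match
end

% Criterion 2: every coordinate must be within tol on its own
xBox = find(all(abs(v) < tol, 2), 1);   % row 2 differs by about 0.80 and 0.84
```

Here xEuclid is 0 (no match) while xBox is 2 (ALIOTH), because the per-coordinate test accepts a square region around each point and the Euclidean test accepts only a circle.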
answered 2013-06-02T07:15:04.703