I have the following data:

Trip      Start_Lat   Start_Long    End_lat      End_Long    Starting_point    Ending_point
Trip_1    56.5624     -85.56845       58.568       45.568         A               B
Trip_1    58.568       45.568       -200.568     -290.568         B               C 
Trip_1   -200.568     -290.568       56.5624     -85.56845        C               D
Trip_2    56.5624     -85.56845     -85.56845    -200.568         A               B
Trip_2   -85.56845    -200.568      -150.568     -190.568         B               C

I want to compute the circuity, which is defined as:

   Circuity = Total Distance Travelled(Trip A+B+C+D) - Straight line (Trip A to D)
              -----------------------------------------------------------------------
                       Total Distance Traveled (Trip A+B+C+D)

I tried the following code:

    df['Distance']= df['flight_distance'] = df.apply(lambda x: great_circle((x['start_lat'], x['start_long']), (x['end_lat'], x['end_long'])).km, axis = 1) 
    df['Total_Distance'] = ((df.groupby('Trip')['distance'].shift(2) +['distance'].shift(1) + df['distance']).abs())

Can you help me find the straight-line distance and the circuity?

1 Answer

UPDATE:

You may want to convert your values to numeric dtypes first:

df[['Start_Lat','Start_Long','End_lat','End_Long']] = \
df[['Start_Lat','Start_Long','End_lat','End_Long']].apply(pd.to_numeric, errors='coerce')
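
To illustrate what `errors='coerce'` does in that conversion, here is a minimal sketch on a made-up frame (the `sample` data, including the `'n/a'` entry, is hypothetical and not taken from the question). Values that cannot be parsed become `NaN` instead of raising an error:

```python
import pandas as pd

# Hypothetical sample: coordinates read in as strings, one of them unparseable
sample = pd.DataFrame({
    'Start_Lat': ['56.5624', '58.568', 'n/a'],
    'Start_Long': ['-85.56845', '45.568', '-290.568'],
})

# Convert every column; unparseable entries are coerced to NaN
converted = sample.apply(pd.to_numeric, errors='coerce')

print(converted.dtypes)
print(converted.isna().sum())
```

Checking `isna().sum()` after the conversion is a quick way to see whether any coordinates were silently dropped.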

IIUC, you can do it this way:

import numpy as np
import pandas as pd

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians)

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.

    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

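def f(df):
    # circuity = 1 - straight_line / total, i.e. (total - straight_line) / total
    # straight line: start of the first leg to the end of the last leg
    # (positions 1-4 assume the column order Start_Lat, Start_Long, End_lat, End_Long)
    return 1 - haversine(df.iloc[0, 1], df.iloc[0, 2],
                         df.iloc[-1, 3], df.iloc[-1, 4]) \
               / \
               haversine(df['Start_Lat'], df['Start_Long'],
                         df['End_lat'], df['End_Long']).sum()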

df.groupby('Trip').apply(f)

Result:

In [120]: df.groupby('Trip').apply(f)
Out[120]:
Trip
Trip_1    1.000000
Trip_2    0.499825
dtype: float64
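
Since the question also asks for the straight-line distance itself, here is a self-contained sketch that keeps the intermediate values as columns. It assumes the rows of each trip are already in travel order, and the names `Leg_km`, `Total_km`, `Straight_km` and `trips` are my own choices for illustration. It uses `(total - straight) / total`, which is algebraically the same as the `1 - straight / total` form used in `f`:

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Sample data copied from the question
df = pd.DataFrame({
    'Trip': ['Trip_1', 'Trip_1', 'Trip_1', 'Trip_2', 'Trip_2'],
    'Start_Lat': [56.5624, 58.568, -200.568, 56.5624, -85.56845],
    'Start_Long': [-85.56845, 45.568, -290.568, -85.56845, -200.568],
    'End_lat': [58.568, -200.568, 56.5624, -85.56845, -150.568],
    'End_Long': [45.568, -290.568, -85.56845, -200.568, -190.568],
})

# Distance of every individual leg
df['Leg_km'] = haversine(df['Start_Lat'], df['Start_Long'],
                         df['End_lat'], df['End_Long'])

# Per trip: first start point, last end point, and the summed leg distances
trips = df.groupby('Trip').agg(
    Start_Lat=('Start_Lat', 'first'),
    Start_Long=('Start_Long', 'first'),
    End_lat=('End_lat', 'last'),
    End_Long=('End_Long', 'last'),
    Total_km=('Leg_km', 'sum'),
)

# Straight line from the trip's first start to its last end
trips['Straight_km'] = haversine(trips['Start_Lat'], trips['Start_Long'],
                                 trips['End_lat'], trips['End_Long'])
trips['Circuity'] = (trips['Total_km'] - trips['Straight_km']) / trips['Total_km']

print(trips[['Total_km', 'Straight_km', 'Circuity']])
```

Note that `Trip_1` gets a circuity of 1 because its last leg ends at the same coordinates where the trip started, so its straight-line distance is zero.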
answered 2017-05-15T14:38:36.917