This article explains how to pass parameters to the aggregation functions used with a pandas groupby.
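Question
I have a DataFrame, referred to as df in the code, and I apply aggregation functions to several columns for each group. I also apply the user-defined lambda functions f4, f5, f6 and f7. Some of these functions are very similar, such as f4, f6 and f7, and differ only in the parameter value. Can I pass these parameters from the dictionary d, so that I only need to write one function instead of several?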
f4 = lambda x: len(x[x>10]) # count the points with bearing greater than threshold value 10
f4.__name__ = 'Frequency'
f5 = lambda x: len(x[x<3.4]) # count the stop points with velocity less than threshold value 3.4
f5.__name__ = 'stop_frequency'
f6 = lambda x: len(x[x>0.2]) # count the points with velocity greater than threshold value 0.2
f6.__name__ = 'frequency'
f7 = lambda x: len(x[x>0.25]) # count the points with acceleration greater than threshold value 0.25
f7.__name__ = 'frequency'
d = {'acceleration':['mean', 'median', 'min'],
'velocity':[f5, 'sum' ,'count', 'median', 'min'],
'velocity_rate':f6,
'acc_rate':f7,
'bearing':['sum', f4],
'bearing_rate':'sum',
'Vincenty_distance':'sum'}
df1 = df.groupby(['userid','trip_id','Transportation_Mode','segmentid'], sort=False).agg(d)
#flattening MultiIndex in columns
df1.columns = df1.columns.map('_'.join)
#MultiIndex in index to columns
df1 = df1.reset_index(level=2, drop=False).reset_index()
I would like to write a function like this:
# desired form (not valid Python as written)
f4(p) = lambda x: len(x[x>p])
f4.__name__ = 'Frequency'
d = {'acceleration':['mean', 'median', 'min'],
'velocity':[f5, 'sum' ,'count', 'median', 'min'],
'velocity_rate':f4(0.2),
'acc_rate':f4(0.25),
'bearing':['sum', f4(10)],
'bearing_rate':'sum',
'Vincenty_distance':'sum'}
A CSV file of the DataFrame df is available at the following link, to make the data clearer: https://drive.google.com/open?id=1R_BBL00G_Dlo-6yrovYJp5zEYLwlMPi9
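Recommended answer
This is possible, but not easy; the solution is based on one by neilaronson.
The solution is also simplified by summing the True values of the boolean mask.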
def f4(p):
    def ipf(x):
        # counts the values below p; same result as your solution, len(x[x < p])
        return (x < p).sum()
    ipf.__name__ = 'Frequency'
    return ipf
d = {'acceleration':['mean', 'median', 'min'],
'velocity':[f4(3.4), 'sum' ,'count', 'median', 'min'],
'velocity_rate':f4(0.2),
'acc_rate':f4(.25),
'bearing':['sum', f4(10)],
'bearing_rate':'sum',
'Vincenty_distance':'sum'}
df1 = df.groupby(['userid','trip_id','Transportation_Mode','segmentid'], sort=False).agg(d)
#flattening MultiIndex in columns
df1.columns = df1.columns.map('_'.join)
#MultiIndex in index to columns
df1 = df1.reset_index(level=2, drop=False).reset_index()
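To make the factory pattern above easier to try out, here is a minimal self-contained sketch. The helper name `count_below` and the tiny `demo` DataFrame are made up for illustration and are not part of the original data; only the technique (a function that returns a named aggregation function) comes from the answer.

```python
import pandas as pd

def count_below(p):
    """Hypothetical helper: build an aggregator that counts values below p."""
    def inner(x):
        # Summing a boolean mask counts its True values
        return (x < p).sum()
    # pandas uses __name__ as the column label when functions are passed in a list
    inner.__name__ = 'Frequency'
    return inner

# Made-up data, for illustration only
demo = pd.DataFrame({
    'segmentid': [1, 1, 1, 2, 2],
    'velocity': [1.0, 5.0, 2.5, 0.1, 4.0],
    'bearing': [12.0, 3.0, 20.0, 8.0, 11.0],
})

result = demo.groupby('segmentid', sort=False).agg(
    {'velocity': [count_below(3.4), 'sum'], 'bearing': [count_below(10)]}
)
# Flatten the MultiIndex columns, as in the answer
result.columns = result.columns.map('_'.join)
print(result)
```

Each call to `count_below` creates a separate closure holding its own threshold, which is why one definition can serve several columns.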
Edit: it is also possible to pass a parameter that selects greater or less:
def f4(p, op):
    def ipf(x):
        if op == 'greater':
            return (x > p).sum()
        elif op == 'less':
            return (x < p).sum()
        else:
            raise ValueError("second argument has to be greater or less only")
    ipf.__name__ = 'Frequency'
    return ipf
d = {'acceleration':['mean', 'median', 'min'],
'velocity':[f4(3.4, 'less'), 'sum' ,'count', 'median', 'min'],
'velocity_rate':f4(0.2, 'greater'),
'acc_rate':f4(.25, 'greater'),
'bearing':['sum', f4(10, 'greater')],
'bearing_rate':'sum',
'Vincenty_distance':'sum'}
df1 = df.groupby(['userid','trip_id','Transportation_Mode','segmentid'], sort=False).agg(d)
#flattening MultiIndex in columns
df1.columns = df1.columns.map('_'.join)
#MultiIndex in index to columns
df1 = df1.reset_index(level=2, drop=False).reset_index()
print(df1.head())
   userid  trip_id  segmentid Transportation_Mode  acceleration_mean
0     141      1.0          1                walk           0.061083
1     141      2.0          1                walk           0.109148
2     141      3.0          1                walk           0.106771
3     141      4.0          1                walk           0.141180
4     141      5.0          1                walk           1.147157
   acceleration_median  acceleration_min  velocity_Frequency  velocity_sum
0        -1.168583e-02         -2.994428              1000.0   1506.679506
1         1.665535e-09         -3.234188               464.0    712.429005
2        -3.055414e-08         -3.131293               996.0   1394.746071
3         9.241707e-09         -3.307262               340.0    513.461259
4        -2.609489e-02         -3.190424               493.0    729.702854
   velocity_count  velocity_median  velocity_min  velocity_rate_Frequency
0            1028         1.294657      0.284747                    288.0
1             486         1.189650      0.284725                    134.0
2            1020         1.241419      0.284733                    301.0
3             352         1.326324      0.339590                     93.0
4             504         1.247868      0.284740                    168.0
   acc_rate_Frequency   bearing_sum  bearing_Frequency  bearing_rate_sum
0               169.0  81604.187066              884.0       -371.276356
1                89.0  25559.589869              313.0       -357.869944
2               203.0 -71540.141199               57.0        946.382581
3                78.0   9548.920765              167.0       -943.184805
4                93.0 -24021.555784               67.0        535.333624
   Vincenty_distance_sum
0            1506.679506
1             712.429005
2            1395.328768
3             513.461259
4             731.823664
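As a possible variation on the edited answer (not part of the original), the comparison could be passed as a function from the standard `operator` module instead of a string, which removes the if/elif branch. The helper name `count_where`, its `name` parameter, and the `demo` data below are assumptions made for this sketch.

```python
import operator
import pandas as pd

def count_where(compare, p, name='Frequency'):
    """Hypothetical helper: count the values for which compare(value, p) is True."""
    def inner(x):
        # compare is e.g. operator.gt, so this evaluates (x > p).sum()
        return compare(x, p).sum()
    # A distinct name per function lets two counts share one column's list
    inner.__name__ = name
    return inner

# Made-up data, for illustration only
demo = pd.DataFrame({
    'segmentid': [1, 1, 1, 2, 2],
    'velocity': [1.0, 5.0, 2.5, 0.1, 4.0],
})

result = demo.groupby('segmentid', sort=False).agg({
    'velocity': [
        count_where(operator.lt, 3.4, 'stop_frequency'),
        count_where(operator.gt, 0.2, 'frequency'),
    ],
})
result.columns = result.columns.map('_'.join)
print(result)
```

The `name` argument matters when two generated functions are applied to the same column, because pandas requires the function names within one column's list to be unique.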