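Basic information
Source name: Multiple filtering methods for wind turbine power curves
Source size: 3.41 MB
File format: .7z
Development language: Python
Last updated: 2021-04-28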

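Source code description
Python implementations of several data-cleaning methods for wind turbine data, with power curve plotting. The methods include k-means, DBSCAN, and KernelDensity.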

import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN


# Run DBSCAN clustering separately within each wind-speed bin
def dbscan_filter(df, eps=0.6, min_samples=2):
    # min_samples must be an integer in recent scikit-learn;
    # 2 gives the same core-point threshold as the original default of 1.5
    x = df.wind_speed
    ws_min = min(x) - 0.1
    ws_max = max(x)
    bin_num = int((ws_max - ws_min) / 0.5) + 1  # wind speed is binned at 0.5 intervals
    bins = [i * 0.5 + ws_min for i in range(bin_num + 1)]
    s = pd.cut(x, bins, labels=list(range(bin_num)))  # assign each row to a bin
    df['bin'] = s  # note: adds a 'bin' column to the caller's DataFrame
    df_group = df.groupby('bin', sort=False, observed=True)  # group by bin, skipping empty bins
    norm_index = []
    abnorm_index = []
    for _, data in df_group:
        if len(data) == 1:
            # a single point cannot be clustered; treat it as normal
            norm_index += data.index.tolist()
        elif len(data) >= 2:
            raw = data[['wind_speed', 'power']].to_numpy()
            # eps is the neighborhood radius, min_samples the minimum number of samples
            db = DBSCAN(eps=eps, min_samples=min_samples).fit(raw)
            labels = db.labels_
            data_ = data.copy()
            # append the DBSCAN result as a new column; -1 marks noise (outlier) points
            data_['cluster_db'] = labels
            norm_index += data_.loc[data_.cluster_db != -1].index.tolist()
            abnorm_index += data_.loc[data_.cluster_db == -1].index.tolist()
    ws_n = df.loc[norm_index, 'wind_speed']
    pw_n = df.loc[norm_index, 'power']
    ws_abn = df.loc[abnorm_index, 'wind_speed']
    pw_abn = df.loc[abnorm_index, 'power']
    return ws_n, pw_n, ws_abn, pw_abn
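
The description also lists KernelDensity among the cleaning methods, but the listing does not show that code. The following is only a minimal sketch of how a per-bin density filter might look, not the package's actual implementation. The function name `kde_filter`, the `bandwidth` and `quantile` values, the minimum bin size of 5, and the synthetic demo data are all assumptions made for illustration. Only the column names `wind_speed` and `power` and the 0.5 bin width come from the snippet above.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KernelDensity


def kde_filter(df, bin_width=0.5, bandwidth=50.0, quantile=0.05):
    """Hypothetical helper: flag low-density power readings in each wind-speed bin.

    bandwidth (in power units) and quantile are illustrative, untuned values.
    """
    # Assign each row to a wind-speed bin of width bin_width
    offset = df['wind_speed'] - df['wind_speed'].min()
    bin_id = np.floor(offset / bin_width).astype(int)
    abnormal = pd.Series(False, index=df.index)
    for _, data in df.groupby(bin_id):
        if len(data) < 5:
            # Too few points to estimate a density; keep them as normal
            continue
        power = data[['power']].to_numpy()
        kde = KernelDensity(kernel='gaussian', bandwidth=bandwidth).fit(power)
        log_dens = kde.score_samples(power)  # log-density at each sample
        # Flag points whose log-density falls below the chosen quantile
        threshold = np.quantile(log_dens, quantile)
        abnormal.loc[data.index] = log_dens < threshold
    return df[~abnormal], df[abnormal]


# Synthetic demo data (an assumption, not taken from the package)
rng = np.random.default_rng(0)
ws = rng.uniform(3, 12, 500)
pw = 2000 / (1 + np.exp(-(ws - 8))) + rng.normal(0, 20, 500)
pw[:10] = 0.0  # inject a few stopped-turbine readings
demo = pd.DataFrame({'wind_speed': ws, 'power': pw})
normal, flagged = kde_filter(demo)
```

Because the threshold is a quantile, this sketch flags roughly the lowest-density 5% of points in every bin, even when the bin is clean. A fixed density threshold would avoid that, at the cost of needing tuning per turbine.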