.NET C#: Detecting significant changes in data

Detecting significant changes in data
2020-10-25

I have a graph input where the X axis is time (going forwards). The Y axis is generally stable but has large drops and rises at different points (marked by the red arrows below).

[Figure: Detecting significant changes in data]

Visually it's obvious, but how do I efficiently detect this from within code? I'm not sure which algorithms I should be using, but I would like to keep it as simple as possible.

Talk1:
Just track the value-delta over the last N samples, with appropriate choice of N.
Talk2:
If the average slope over N points exceeds some threshold, then it's a "significant" difference.
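
The two comments above amount to nearly the same check, since the average slope over N intervals is the value delta divided by N. A minimal C# sketch, assuming evenly spaced samples; the names SlopeDetector and FindSignificant and the choice of threshold are illustrative and do not come from the comments:

```csharp
using System;
using System.Collections.Generic;

public static class SlopeDetector
{
    // Returns the indices i at which the average slope over the previous n
    // intervals, (y[i] - y[i - n]) / n, exceeds the threshold in magnitude.
    // Assumes the samples are evenly spaced in time.
    public static List<int> FindSignificant(IReadOnlyList<double> y, int n, double threshold)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        var hits = new List<int>();
        for (int i = n; i < y.Count; i++)
        {
            double averageSlope = (y[i] - y[i - n]) / n;
            if (Math.Abs(averageSlope) > threshold)
                hits.Add(i);
        }
        return hits;
    }
}
```

For y = {10, 10, 10, 10, 4, 4, 4, 4} with n = 2 and a threshold of 2, this reports indices 4 and 5. A larger n smooths out noise but smears each change over more indices.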
Solutions1

A simple way is to calculate the difference between every two neighbouring samples, e.g. diff = abs(y[i + 1] - y[i]), and then calculate the standard deviation of all the differences. Measuring each difference against that standard deviation lets you rank the differences, and it also helps eliminate the random noise you would pick up if you simply took the largest diff values.
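
A rough C# sketch of this idea. The answer does not say how to turn the standard deviation into a decision, so the cut-off used here (mean plus k standard deviations) is an assumption, as are the names DiffDetector and FindJumps:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DiffDetector
{
    // Returns the indices i where |y[i + 1] - y[i]| lies more than k standard
    // deviations above the mean absolute difference (assumed cut-off rule).
    public static List<int> FindJumps(IReadOnlyList<double> y, double k)
    {
        var hits = new List<int>();
        if (y.Count < 3) return hits; // need at least two differences

        double[] diffs = new double[y.Count - 1];
        for (int i = 0; i < diffs.Length; i++)
            diffs[i] = Math.Abs(y[i + 1] - y[i]);

        double mean = diffs.Average();
        // Population standard deviation of the differences.
        double stdDev = Math.Sqrt(diffs.Sum(d => (d - mean) * (d - mean)) / diffs.Length);

        for (int i = 0; i < diffs.Length; i++)
            if (diffs[i] > mean + k * stdDev)
                hits.Add(i);
        return hits;
    }
}
```

Note that a single jump also inflates the standard deviation it is compared against, so on short series k may need to be fairly small.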

If your rises and drops span several x periods (e.g. temperature plotted every minute), then calculate the diff over a window of N samples, using the max and min of those N samples. If you want 5 samples to be the detection period, take samples 0, 1, 2, 3, 4, extract the min and max, and use those for the diff. Repeat for samples 1, 2, 3, 4, 5, and so on. You may need to experiment with N, as too many samples start to affect the standard deviation.
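
The sliding-window step could look like the following C# sketch. The names WindowRange and Compute are illustrative; the result is one max-minus-min value per window, which can then be fed into the same standard-deviation comparison as the single-step differences:

```csharp
using System;
using System.Collections.Generic;

public static class WindowRange
{
    // For each window of n consecutive samples (0..n-1, 1..n, ...), returns
    // max - min within that window.
    public static double[] Compute(IReadOnlyList<double> y, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        if (y.Count < n) return new double[0];

        var ranges = new double[y.Count - n + 1];
        for (int start = 0; start < ranges.Length; start++)
        {
            double min = y[start], max = y[start];
            for (int j = start + 1; j < start + n; j++)
            {
                min = Math.Min(min, y[j]);
                max = Math.Max(max, y[j]);
            }
            ranges[start] = max - min;
        }
        return ranges;
    }
}
```

This rescans every window, which keeps the code simple at the cost of doing work proportional to the series length times n. That is usually fine for a small detection period such as 5.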

An alternative method is to calculate the slope of the up/down parts of the chart by subsampling and selecting the slopes and lengths that are interesting. While this can be more accurate for automated detection, it is much harder to describe the algorithm in depth.
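
Since the answer leaves this method open, the following C# sketch is only one possible reading of it: take every step-th sample, merge consecutive moves in the same direction into a segment, and keep the segments that are long and steep enough. The names SlopeSegments and Find, and the parameters step, minLength and minSlope, are all assumptions:

```csharp
using System;
using System.Collections.Generic;

public static class SlopeSegments
{
    // Takes every step-th sample, groups consecutive moves in the same
    // direction into one segment, and returns the segments (as start and end
    // indices into y) whose length and average slope are both large enough.
    public static List<(int Start, int End, double Slope)> Find(
        IReadOnlyList<double> y, int step, int minLength, double minSlope)
    {
        if (step < 1) throw new ArgumentOutOfRangeException(nameof(step));

        var segments = new List<(int Start, int End, double Slope)>();
        int segStart = 0;
        int direction = 0; // -1 falling, 0 flat, +1 rising

        for (int i = step; ; i += step)
        {
            bool atEnd = i >= y.Count;
            int dir = atEnd ? 0 : Math.Sign(y[i] - y[i - step]);
            if (atEnd || dir != direction)
            {
                int segEnd = i - step; // last subsample of the finished segment
                int length = segEnd - segStart;
                if (direction != 0 && length >= minLength)
                {
                    double slope = (y[segEnd] - y[segStart]) / length;
                    if (Math.Abs(slope) >= minSlope)
                        segments.Add((segStart, segEnd, slope));
                }
                if (atEnd) break;
                segStart = segEnd;
                direction = dir;
            }
        }
        return segments;
    }
}
```

On noisy data a larger step (or smoothing beforehand) is needed, because a single sample moving the wrong way ends the current segment.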

I've worked on similar issues and built a chart categoriser, but would really love references to research in this area.

When you get this going, you may also want to look at 'control charts' from operations research. They identify several patterns that might also be worth detecting, depending on what your charts are of.
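
As an illustration of the simplest control-chart rule (a point falling outside the centre line plus or minus three standard deviations), here is a simplified C# sketch. It assumes the first baselineCount samples represent normal, stable behaviour; the names ControlChart and OutOfControl are illustrative, and real control charts define further rules that are not shown here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ControlChart
{
    // Estimates the centre line and limits from the first baselineCount
    // samples, then returns the indices of later samples that fall outside
    // mean ± sigmas * standard deviation.
    public static List<int> OutOfControl(IReadOnlyList<double> y, int baselineCount, double sigmas = 3.0)
    {
        if (baselineCount < 2 || baselineCount > y.Count)
            throw new ArgumentOutOfRangeException(nameof(baselineCount));

        var baseline = y.Take(baselineCount).ToArray();
        double mean = baseline.Average();
        // Sample standard deviation (n - 1 in the denominator).
        double stdDev = Math.Sqrt(baseline.Sum(v => (v - mean) * (v - mean)) / (baselineCount - 1));
        double upper = mean + sigmas * stdDev;
        double lower = mean - sigmas * stdDev;

        var hits = new List<int>();
        for (int i = baselineCount; i < y.Count; i++)
            if (y[i] > upper || y[i] < lower)
                hits.Add(i);
        return hits;
    }
}
```

Unlike the difference-based checks, this flags levels rather than changes, so it keeps firing for as long as the series stays outside the limits.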

Reposted from: https://stackoverflow.com/questions/30424953/detecting-significant-changes-in-data
