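I'm developing a project in Python to analyze SSH.
Currently I'm stuck on two things:
- Determining the idle time of a connection (the percentage of time during which no data is transferred over the connection)
- Determining the type of connection (shell, tunnel, scp, etc.) -> the type of the channels inside the connection
How can I solve this?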
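Encrypted traffic can still leak a substantial amount of information once you exploit domain-specific details. It's worth reviewing past research to get an idea of the approaches. For SSH in particular, I recommend reading Dawn Song's paper on inferring login passwords from SSH sessions.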
Another example: Bro uses a heuristic to discern successful from unsuccessful logins based on the number of bytes transferred at the beginning of the connection.
In general, I recommend recording traces of the activity you want to profile/classify later. This way, you have ground truth and can find out where SSH behaves differently from what you expect.
To determine the idle time of interactive sessions, you need to understand the noise, if any, that SSH injects during periods of no activity. Then you can create a time series of the number of bytes transferred and experiment with the time resolution to see which granularity best models your trace. Moreover, you can decompose the time series into two components, one being SSH protocol noise and the other user activity.
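As a rough illustration of the time-series idea, here is a minimal sketch that bins a trace at a fixed resolution and reports the fraction of idle bins. The function name `idle_fraction`, the `(timestamp, payload_bytes)` input format, and the `noise_bytes` threshold are all assumptions made for this example. In particular, treating any bin with at most `noise_bytes` as idle is only a crude stand-in for a real decomposition into protocol noise and user activity; you would have to derive that threshold from your own recorded traces.

```python
def idle_fraction(packets, resolution=1.0, noise_bytes=0):
    """Return the fraction of time bins that count as idle.

    packets     -- iterable of (timestamp_seconds, payload_bytes) tuples
                   (hypothetical input format, e.g. extracted from a pcap)
    resolution  -- bin width in seconds; experiment with this value
    noise_bytes -- assumed upper bound on bytes per bin caused by SSH
                   protocol noise alone; bins at or below it count as idle
    """
    packets = sorted(packets)
    if not packets:
        return 0.0

    start, end = packets[0][0], packets[-1][0]
    n_bins = int((end - start) // resolution) + 1

    # Build the time series: total bytes transferred per bin.
    series = [0] * n_bins
    for timestamp, size in packets:
        series[int((timestamp - start) // resolution)] += size

    idle_bins = sum(1 for total in series if total <= noise_bytes)
    return idle_bins / n_bins


# Hypothetical trace: two bursts of activity and one small packet between them.
trace = [(0.0, 100), (0.5, 50), (3.2, 48), (9.9, 200)]
print(idle_fraction(trace, resolution=1.0))
print(idle_fraction(trace, resolution=1.0, noise_bytes=48))
```

Running this over the same trace with several values of `resolution` is a cheap way to see how sensitive the idle percentage is to the granularity you pick.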
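Determining the connection type sounds like a classic unsupervised learning problem: clustering, e.g., k-means or mixture models. Coming up with the right feature set will probably take some research. For example, it may be difficult to tell an interactive session apart from a tunnel if the tunneled connection is interactive as well. In your model, you could consider size deltas or even include more context, such as stepping stone detection.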
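To make the clustering suggestion concrete, here is a small sketch of plain k-means in pure Python. The two features used (mean payload size and mean inter-arrival time per connection) and the sample values are hypothetical placeholders, not a recommended feature set, and the helper names `min_max_scale` and `kmeans` are made up for this example. A library implementation or a mixture model may well suit real data better.

```python
import random


def min_max_scale(rows):
    """Scale every feature column to [0, 1] so no feature dominates."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((value - low) / span for value, low, span in zip(row, lows, spans))
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical per-connection features:
# (mean payload bytes per packet, mean inter-arrival time in seconds)
connections = [
    (60, 0.8), (72, 1.1), (55, 0.9),            # small, slow packets
    (1400, 0.01), (1380, 0.02), (1420, 0.015),  # large, fast packets
]
labels, centroids = kmeans(min_max_scale(connections), k=2)
print(labels)
```

The cluster numbers themselves carry no meaning; you would still need labeled traces to map each cluster to shell, tunnel, scp, and so on.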