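The question at hand depends on many factors:
- hardware
- operating system (and its configuration)
- JVM implementation
- network devices
- server behaviour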
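First question - Should the difference be this significant?
It depends on the load, the pool size and the network, but it could be far more than the observed factor of 2 in either direction (in favour of the async or the threaded solution). According to your later comment the difference was mostly down to a mistake, but for the sake of argument I'll explain the possible cases.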
Dedicated threads can be quite a burden. (Interrupt handling and thread scheduling are done by the operating system if you are using the Oracle [HotSpot] JVM, as these tasks are delegated.) The OS can become unresponsive if there are too many threads, which slows down your batch processing (or other tasks). Thread management involves a lot of administrative work, which is why thread (and connection) pooling is a thing. Although a good operating system should be able to handle a few thousand concurrent threads, there is always the chance that some limit is hit or some (kernel) event occurs.
This is where pooling and async behaviour come in handy. Say there is a pool of 10 physical threads doing all the work. If a task is blocked (waiting for the server response in this case) it enters the "Blocked" state (see image) and the next task gets the physical thread to do some work. When a thread is notified (data has arrived) it becomes "Runnable", from which point the pooling mechanism is able to pick it up (this could be an OS or a JVM implemented solution). For further reading on thread states I recommend W3Rescue. To understand thread pooling better I recommend this baeldung article.
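To make the pool idea concrete, here is a minimal sketch using the standard `java.util.concurrent` executor, assuming 100 tasks and a `Thread.sleep` as a stand-in for waiting on the server. The class and method names (`PoolSketch`, `runOnPool`) are made up for illustration and are not from the question. Note that in a plain `ExecutorService` a sleeping task keeps its worker thread until it finishes, so this only shows how the pool caps the number of threads; handing the thread to another task while waiting needs non-blocking I/O, which is what async HTTP clients do internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class PoolSketch {

    // Runs `taskCount` simulated blocking calls on a pool of `poolSize` threads
    // and returns how many distinct worker threads actually did the work.
    static int runOnPool(int poolSize, int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                results.add(pool.submit(() -> {
                    workers.add(Thread.currentThread().getName());
                    Thread.sleep(20); // assumption: stand-in for waiting on the server response
                    return id;
                }));
            }
            for (Future<Integer> f : results) {
                f.get(); // wait for every task; rethrows a task failure if there was one
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return workers.size();
    }

    public static void main(String[] args) throws Exception {
        // 100 tasks are queued, but never more than 10 threads exist to run them.
        System.out.println("worker threads used: " + runOnPool(10, 100));
    }
}
```

However many tasks are submitted, the number of worker threads never exceeds the pool size; the remaining tasks simply wait in the executor's queue.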
Second question - is something wrong with the async implementation? If not, what is the right approach to go about here?
The implementation is OK, there is no problem with it. The behaviour is just different from the threaded way. The main question in these cases is mostly what the SLAs (service level agreements) are. If you are the only "customer" of the service, then basically you have to decide between latency and throughput, and the decision will affect only you. Mostly this is not the case, so I would recommend some kind of pooling that is supported by the library you are using.
Third question - However I just noted that the time taken is roughly the same the moment you read the response stream as a string. I wonder why this is?
The message has most likely arrived completely in both cases (the response is probably not a stream, just a few HTTP packets), but if you read only the header, the response body itself does not need to be parsed and loaded into the CPU registers, which reduces the latency of reading the data actually received. I think this is a cool representation of latencies (source and source).
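Separately from the latency figures, the difference between reading only the status and reading the whole body can be tried out locally. The sketch below is an assumption-laden illustration, not the code from the question: it starts a throwaway server with the JDK's built-in `com.sun.net.httpserver.HttpServer`, and the names `StatusVsBodySketch`, `statusOnly` and `statusAndBody` are invented for the example. The timings it prints are rough (no warm-up, single run), so treat them as an indication only.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusVsBodySketch {

    // Starts a throwaway local server that answers every request with `body`.
    static HttpServer startServer(byte[] body) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Reads only the status line and headers; the body is never consumed.
    static int statusOnly(URL url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Drains the whole response body into a String.
    static String statusAndBody(URL url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a 200,000 character body is large enough to make the difference visible.
        byte[] body = "x".repeat(200_000).getBytes(StandardCharsets.UTF_8);
        HttpServer server = startServer(body);
        try {
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
            long t0 = System.nanoTime();
            int code = statusOnly(url);
            long t1 = System.nanoTime();
            int length = statusAndBody(url).length();
            long t2 = System.nanoTime();
            System.out.printf("status only (%d): %d us%n", code, (t1 - t0) / 1_000);
            System.out.printf("status + body (%d chars): %d us%n", length, (t2 - t1) / 1_000);
        } finally {
            server.stop(0);
        }
    }
}
```

Against a real remote server the network round trip usually dominates both numbers, which fits the observation that the two approaches take roughly the same time once the body is read in both.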
This came out as quite a long answer, so TL;DR: scaling is a really hardcore topic, and it depends on a lot of things:
- hardware: number of physical cores, multi-threading capacity, memory speed, network interface
- operating system (and its configuration): thread management, interrupt handling
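- JVM implementation: thread management (internal or outsourced to the OS), not to mention GC and JIT configuration
- network devices: some limit the number of concurrent connections from a given IP, some pool non-HTTPS connections and act as a proxy
- server behaviour: pooled workers or per-request workers, etc.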
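Most likely in your case the server was the bottleneck, as both methods gave the same result in the corrected case (HttpResponse::getStatusLine().getStatusCode() and HttpURLConnection::getResponseCode()). To give a proper answer you should measure your server's performance with tools like JMeter or LoadRunner, and then size your solution accordingly. This article is more about DB connection pooling, but the logic applies here as well.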