Background:
I am using Istio 1.2.5.
I deployed Istio with the Helm default profile from the Istio documentation, with tracing and Kiali enabled and logLevel set to "debug".
My pods and services in the istio-system namespace look like this:
(⎈ |cluster-dev:default)➜ istio-1.2.5 git:(master) ✗ k get pods -n istio-system
NAME READY STATUS RESTARTS AGE
grafana-97fb6966d-cv5fq 1/1 Running 0 1d
istio-citadel-76f9586b8b-4bbcx 1/1 Running 0 1d
istio-galley-78f65c8469-v5cmn 1/1 Running 0 1d
istio-ingressgateway-5d5487c666-jjhb7 1/1 Running 0 1d
istio-pilot-65cb5648bf-4nfl7 2/2 Running 0 1d
istio-policy-8596cc6554-7sgzt 2/2 Running 0 1d
istio-sidecar-injector-76f487845d-ktf6p 1/1 Running 0 1d
istio-telemetry-5c6b6d59f6-lppqt 2/2 Running 0 1d
istio-tracing-595796cf54-zsnvj 1/1 Running 0 1d
kiali-55fcfc86cc-p2jrk 1/1 Running 0 1d
prometheus-5679cb4dcd-h7qsj 1/1 Running 0 1d
(⎈ |cluster-dev:default)➜ istio-1.2.5 git:(master) ✗ k get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana ClusterIP 172.20.116.246 <none> 3000/TCP 1d
istio-citadel ClusterIP 172.20.177.97 <none> 8060/TCP,15014/TCP 1d
istio-galley ClusterIP 172.20.162.16 <none> 443/TCP,15014/TCP,9901/TCP 1d
istio-ingressgateway LoadBalancer 172.20.199.160 xxxxxxxxxxxxx... 15020:31334/TCP,80:31380/TCP,443:31390/TCP,31400:31400/TCP,15029:30200/TCP,15030:32111/TCP,15031:32286/TCP,15032:32720/TCP,15443:30857/TCP 1d
istio-pilot ClusterIP 172.20.137.21 <none> 15010/TCP,15011/TCP,8080/TCP,15014/TCP 1d
istio-policy ClusterIP 172.20.188.114 <none> 9091/TCP,15004/TCP,15014/TCP 1d
istio-sidecar-injector ClusterIP 172.20.47.238 <none> 443/TCP 1d
istio-telemetry ClusterIP 172.20.77.52 <none> 9091/TCP,15004/TCP,15014/TCP,42422/TCP 1d
jaeger-agent ClusterIP None <none> 5775/UDP,6831/UDP,6832/UDP 1d
jaeger-collector ClusterIP 172.20.225.255 <none> 14267/TCP,14268/TCP 1d
jaeger-query ClusterIP 172.20.181.245 <none> 16686/TCP 1d
kiali ClusterIP 172.20.72.227 <none> 20001/TCP 1d
prometheus ClusterIP 172.20.25.75 <none> 9090/TCP 1d
tracing ClusterIP 172.20.211.135 <none> 80/TCP 1d
zipkin ClusterIP 172.20.204.123 <none> 9411/TCP 1d
I haven't used any egress gateway, and my outboundTrafficPolicy mode is ALLOW_ANY, so I am assuming I don't need any ServiceEntry.
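For reference, this is roughly the Helm value behind that mode (a sketch, assuming the standard istio chart values; with ALLOW_ANY, the sidecar sends traffic for unknown external hosts through Envoy's PassthroughCluster, which you can see in the proxy logs below):

global:
  outboundTrafficPolicy:
    mode: ALLOW_ANY   # unknown external destinations are passed through, not blocked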
I am using the nginx ingress controller as the entry point to my cluster and am not using the Istio ingress gateway.
Issue:
I have a microservice in my cluster that reaches out to an external URL (outside my cluster, to a legacy system) with an HTTP POST request. This system normally responds back to my microservice in about 25 seconds and has a hard timeout of 30 seconds on its side.
When I am not using the Istio sidecar, my microservice responds normally. But after deploying Istio with the sidecar, I get a 504 Gateway Timeout every time, after exactly 15 seconds.
Logs of the microservice as well as istio-proxy:
Microservice logs without Istio (search response took 21.957 seconds in the logs):
2019-09-06 19:42:20.113 INFO [xx-xx-adapter,9b32565791541300,9b32565791541300,false] 1 --- [or-http-epoll-4] c.s.t.s.impl.common.PrepareRequest : Start Preparing search request
2019-09-06 19:42:20.117 INFO [xx-xx-adapter,9b32565791541300,9b32565791541300,false] 1 --- [or-http-epoll-4] c.s.t.s.impl.common.PrepareRequest : Done Preparing search request
2019-09-06 19:42:42.162 INFO [xx-xx-adapter,9b32565791541300,9b32565791541300,false] 1 --- [or-http-epoll-8] c.s.t.service.impl.TVSearchServiceImpl : xxxx search response took 21.957 Seconds
2019-09-06 19:42:42.292 INFO [xx-xx-adapter,9b32565791541300,9b32565791541300,false] 1 --- [or-http-epoll-8] c.s.t.service.impl.common.Transformer : Doing transformation of supplier response into our response
2019-09-06 19:42:42.296 INFO [xx-xx-adapter,9b32565791541300,9b32565791541300,false] 1 --- [or-http-epoll-8] c.s.t.service.impl.common.Transformer : Transformer: Parsing completed in 3 mSeconds
Microservice logs with Istio (response took 15.009 seconds in the logs):
2019-09-06 19:40:00.048 INFO [xxx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-3] c.s.t.s.impl.common.PrepareRequest : Start Preparing search request
2019-09-06 19:40:00.048 INFO [xxx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-3] c.s.t.s.impl.common.PrepareRequest : Done Preparing search request
2019-09-06 19:40:15.058 INFO [xx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-7] c.s.t.service.impl.xxxx : xxx Search Request {"rqst":{"Request":{"__type":"xx","CheckIn":"/Date(1569628800000+0000)/","CheckOut":"/Date(1569801600000+0000)/","DetailLevel":9,"ExcludeHotelDetails":false,"GeoLocationInfo":{"Latitude":25.204849,"Longitude":55.270782},"Nights":0,"RadiusInMeters":25000,"Rooms":[{"AdultsCount":2,"KidsAges":[2]}],"DesiredResultCurrency":"EUR","Residency":"GB","TimeoutSeconds":25,"ClientIP":"127.0.0.1"},"RequestType":1,"TypeOfService":2,"Credentials":{"UserName":"xxxx","Password":"xx"}}}
2019-09-06 19:40:15.058 ERROR [xx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-7] c.s.t.service.impl.xxxx : xxx Search request failed 504 GATEWAY_TIMEOUT
2019-09-06 19:40:15.058 INFO [xxx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-7] c.s.t.service.impl.xxxx : xx search response took 15.009 Seconds
2019-09-06 19:40:15.059 ERROR [xxx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-7] a.w.r.e.AbstractErrorWebExceptionHandler : [79d38e2f] 500 Server Error for HTTP POST "/search/geo-location"
java.lang.RuntimeException: Error occurred, We did not receive proper search response from xx please check internal logs for more info
<Java Stack trace >
2019-09-06 19:40:15.061 ERROR [xxx-xx-adapter,32c55821a507d6f3,32c55821a507d6f3,false] 1 --- [or-http-epoll-7] c.s.t.service.impl.xxxx : xxx search response upstream request timeout
2019-09-06 19:41:16.081 INFO [xxx-xx--adapter,,,] 1 --- [ Thread-22] o.s.s.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'executor'
Envoy sidecar proxy logs (masked):
[2019-09-06T20:32:15.418Z] "POST /xxxxxx/xxxxx.svc/xx/ServiceRequest HTTP/1.1" 504 UT "-" "-" 517 24 14997 - "-" "ReactorNetty/0.8.10.RELEASE" "c273fac1-8xxa-xxxx-xxxx-xxxxxx" "testdomain.testurl.com" "40.67.217.71:80" PassthroughCluster - 1.7.17.71:80 1.20.21.25:42386 -
[2019-09-06 20:39:01.719][34][debug][router] [external/envoy/source/common/router/router.cc:332] [C57][S17104382791712695742] cluster 'PassthroughCluster' match for URL '/xxxxx/xxxx.svc/json/ServiceRequest'
[2019-09-06 20:39:01.719][34][debug][router] [external/envoy/source/common/router/router.cc:332] [C57][S17104382791712695742] cluster 'PassthroughCluster' match for URL '/xxxxx/xxxx.svc/json/ServiceRequest'
[2019-09-06 20:39:01.719][34][debug][upstream] [external/envoy/source/common/upstream/original_dst_cluster.cc:87] Created host 40.67.217.71:80.
[2019-09-06 20:39:01.719][34][debug][router] [external/envoy/source/common/router/router.cc:393] [C57][S17104382791712695742] router decoding headers:
':authority', 'x.x.com'
':path', '/xxxxx/xxxxx.svc/json/ServiceRequest'
':method', 'POST'
':scheme', 'http'
'user-agent', 'ReactorNetty/0.8.10.RELEASE'
'accept', 'application/json'
'accept-encoding', 'gzip, deflate'
'x-newrelic-transaction', 'PxRSBVVQXAdVUgNTUgcPUQUBFB8EBw8RVU4aB1wLB1YHAA8DAAQFWlNXB0NKQV5XCVVQAQcGFTs='
'x-newrelic-id', 'VgUDWFVaARADUFNWAgQHV1A='
'content-type', 'application/json;charset=UTF-8'
'content-length', '517'
'x-forwarded-proto', 'http'
'x-request-id', '750f4fdb-83f9-409c-9ecf-e0a1fdacbb65'
'x-istio-attributes', 'CloKCnNvdXJjZS51aWQSTBJKa3ViZXJuZXRlczovL2hvdGVscy10di1hZGFwdGVyLXNlcnZpY2UtZGVwbG95bWVudC03ZjQ0ZDljNjVjLWhweHo3LmRlZmF1bHQ='
'x-envoy-expected-rq-timeout-ms', '15000'
'x-b3-traceid', '971ac547c63fa66e'
'x-b3-spanid', '58c12e7da54ae50f'
'x-b3-parentspanid', 'dc7bda5b98d522bf'
'x-b3-sampled', '0'
[2019-09-06 20:39:01.719][34][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:88] creating a new connection
[2019-09-06 20:39:01.719][34][debug][client] [external/envoy/source/common/http/codec_client.cc:26] [C278] connecting
[2019-09-06 20:39:01.719][35][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:973] membership update for TLS cluster PassthroughCluster added 1 removed 0
[2019-09-06 20:39:01.719][28][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:973] membership update for TLS cluster PassthroughCluster added 1 removed 0
[2019-09-06 20:39:01.719][34][debug][connection] [external/envoy/source/common/network/connection_impl.cc:704] [C278] connecting to 40.67.217.71:80
[2019-09-06 20:39:01.720][35][debug][upstream] [external/envoy/source/common/upstream/original_dst_cluster.cc:41] Adding host 40.67.217.71:80.
[2019-09-06 20:39:01.720][28][debug][upstream] [external/envoy/source/common/upstream/original_dst_cluster.cc:41] Adding host 40.67.217.71:80.
[2019-09-06 20:39:01.720][34][debug][connection] [external/envoy/source/common/network/connection_impl.cc:713] [C278] connection in progress
[2019-09-06 20:39:01.720][34][debug][pool] [external/envoy/source/common/http/conn_pool_base.cc:20] queueing request due to no available connections
[2019-09-06 20:39:01.720][34][debug][filter] [src/envoy/http/mixer/filter.cc:102] Called Mixer::Filter : decodeData (517, false)
[2019-09-06 20:39:01.720][34][debug][http] [external/envoy/source/common/http/conn_manager_impl.cc:1079] [C57][S17104382791712695742] request end stream
[2019-09-06 20:39:01.720][34][debug][filter] [src/envoy/http/mixer/filter.cc:102] Called Mixer::Filter : decodeData (0, true)
[2019-09-06 20:39:01.720][34][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:973] membership update for TLS cluster PassthroughCluster added 1 removed 0
[2019-09-06 20:39:01.748][34][debug][connection] [external/envoy/source/common/network/connection_impl.cc:552] [C278] connected
[2019-09-06 20:39:01.748][34][debug][client] [external/envoy/source/common/http/codec_client.cc:64] [C278] connected
[2019-09-06 20:39:01.748][34][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:245] [C278] attaching to next request
[2019-09-06 20:39:01.748][34][debug][router] [external/envoy/source/common/router/router.cc:1210] [C57][S17104382791712695742] pool ready
[2019-09-06 20:39:02.431][35][debug][filter] [external/envoy/source/extensions/filters/listener/original_dst/original_dst.cc:18] original_dst: New connection accepted
[2019-09-06 20:39:02.431][35][debug][filter] [external/envoy/source/extensions/filters/listener/tls_inspector/tls_inspector.cc:72] tls inspector: new connection accepted
[2019-09-06 20:39:02.431][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:30] Called tcp filter: Filter
[2019-09-06 20:39:02.431][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:40] Called tcp filter: initializeReadFilterCallbacks
[2019-09-06 20:39:02.431][35][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:200] [C279] new tcp proxy session
[2019-09-06 20:39:02.431][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:135] [C279] Called tcp filter onNewConnection: remote 1.210.4.2:39482, local 1.10.1.5:80
[2019-09-06 20:39:02.431][35][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:343] [C279] Creating connection to cluster inbound|80|xx-xx-adapter-service|xx-xx-adapter-service.default.svc.cluster.local
[2019-09-06 20:39:02.431][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:80] creating a new connection
[2019-09-06 20:39:02.431][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:372] [C280] connecting
[2019-09-06 20:39:02.431][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:704] [C280] connecting to 127.0.0.1:80
[2019-09-06 20:39:02.431][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:713] [C280] connection in progress
[2019-09-06 20:39:02.431][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:106] queueing request due to no available connections
[2019-09-06 20:39:02.431][35][debug][main] [external/envoy/source/server/connection_handler_impl.cc:257] [C279] new connection
[2019-09-06 20:39:02.431][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:552] [C280] connected
[2019-09-06 20:39:02.431][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:293] [C280] assigning connection
[2019-09-06 20:39:02.431][35][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:542] TCP:onUpstreamEvent(), requestedServerName:
[2019-09-06 20:39:02.431][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:143] Called tcp filter completeCheck: OK
[2019-09-06 20:39:02.432][35][debug][filter] [src/istio/control/client_context_base.cc:140] Report attributes: attributes {
key: "connection.event"
value {
string_value: "open"
}
}
attributes {
key: "connection.id"
value {
string_value: "82e869af-aec6-406a-8a52-4168a19eb1f0-279"
[2019-09-06 20:39:02.432][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:102] [C279] Called tcp filter onRead bytes: 130
[2019-09-06 20:39:02.435][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:125] [C279] Called tcp filter onWrite bytes: 147
[2019-09-06 20:39:02.435][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:102] [C279] Called tcp filter onRead bytes: 0
[2019-09-06 20:39:02.436][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:125] [C279] Called tcp filter onWrite bytes: 0
[2019-09-06 20:39:02.436][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:520] [C280] remote close
[2019-09-06 20:39:02.436][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:190] [C280] closing socket: 0
[2019-09-06 20:39:02.436][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:121] [C280] client disconnected
[2019-09-06 20:39:02.436][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:101] [C279] closing data_to_write=0 type=0
[2019-09-06 20:39:02.436][35][debug][connection] [external/envoy/source/common/network/connection_impl.cc:190] [C279] closing socket: 1
[2019-09-06 20:39:02.436][35][debug][filter] [src/envoy/tcp/mixer/filter.cc:174] [C279] Called tcp filter onEvent: 1 upstream 127.0.0.1:80
[2019-09-06 20:39:02.436][35][debug][filter] [src/istio/control/client_context_base.cc:140] Report attributes: attributes {
key: "connection.duration"
value {
duration_value {
nanos: 4358000
}
}
}
attributes {
key: "connection.event"
value {
string_value: "close"
}
}
at
[2019-09-06 20:39:02.436][35][debug][main] [external/envoy/source/server/connection_handler_impl.cc:68] [C279] adding to cleanup list
[2019-09-06 20:39:02.436][35][debug][pool] [external/envoy/source/common/tcp/conn_pool.cc:246] [C280] connection destroyed
I tried creating a VirtualService over my microservice with a timeout of 30 seconds, but no luck.
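What I tried looked roughly like this (a sketch; the service name is reconstructed from the masked Envoy logs above):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: xx-xx-adapter-service
spec:
  hosts:
  - xx-xx-adapter-service.default.svc.cluster.local
  http:
  - timeout: 30s   # had no effect on the outbound call to the legacy system
    route:
    - destination:
        host: xx-xx-adapter-service.default.svc.cluster.local

In hindsight this makes sense: a VirtualService on the microservice's own host only governs traffic addressed to the microservice, not the calls it makes out to the external legacy system.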
I am not sure what I am missing. Need help. (The decoded request headers in the Envoy debug log above do show x-envoy-expected-rq-timeout-ms: 15000, which matches the 15-second cutoff exactly.)
Crux: Internet (or a pod console) --> microservice --> microservice calls a 3rd-party legacy URL internally in code and gets a timeout at exactly 15 seconds, every time.
Edit: adding more details:
Just to add delay, I tried curl http://slowwly.robertomurray.co.uk/delay/17000/url/http://www.google.com (17000 = 17 seconds) from microservice pods that have the istio-proxy sidecar, and I always get timeouts at 15 seconds.
I am not sure where to change this 15-second Envoy setting.
Answer:
After a lot of trial and error, we got this working.
Reference: https://github.com/istio/istio/issues/16915#issuecomment-529210672
So for any outbound traffic, whether you have an egress gateway or not, the default timeout is 15 seconds. If you want to increase it, you need a ServiceEntry for the external host and a VirtualService that defines the timeout, as sketched below.
ServiceEntry
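A minimal sketch, using the masked hostname testdomain.testurl.com from the access log as a stand-in for the real legacy host, plain HTTP on port 80:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: legacy-system
spec:
  hosts:
  - testdomain.testurl.com   # masked placeholder; use the real external hostname
  location: MESH_EXTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS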
VirtualService
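And a matching sketch of the VirtualService that raises the timeout above Envoy's 15-second default; 40 seconds leaves headroom over the legacy system's 30-second hard timeout:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: legacy-system-timeout
spec:
  hosts:
  - testdomain.testurl.com   # must match the ServiceEntry host
  http:
  - timeout: 40s   # anything above the legacy system's 30-second hard timeout
    route:
    - destination:
        host: testdomain.testurl.com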
Another answer:
It seems 15 seconds is a default timeout value; whether it is Istio or Envoy that sets it, I have yet to read further.
But there are a couple of reported issues, such as #1888 (Istio 0.2.12 and Kubernetes 1.8.1) and #6860, which were discussed as being very similar to your issue:
You can read through the discussion of #6860, but in summary, a fix was introduced starting with 1.0.0 so that the timeout can be set greater than 15 seconds. To do that, you need to write the timeout in your VirtualServices: for example, set 60 seconds for the internal service and 40 seconds for your external service so that there is enough allowance.
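For illustration, a minimal sketch of the internal-service side of that suggestion (my-internal-service is a hypothetical name; the 40-second external VirtualService is shown above):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-internal-service-timeout
spec:
  hosts:
  - my-internal-service   # hypothetical in-mesh service
  http:
  - timeout: 60s   # longer than the 40s external timeout, so the outer request is not cut off first
    route:
    - destination:
        host: my-internal-service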
Please check and see if it helps.