I am trying to deploy a sample Angular application to GKE. I created a sample cluster with the Cloud Run and Istio add-ons enabled:

gcloud beta container clusters create new-cluster \
--addons=HorizontalPodAutoscaling,HttpLoadBalancing,Istio,CloudRun \
--machine-type=n1-standard-2 \
--cluster-version=latest \
--zone=us-east1-b \
--enable-stackdriver-kubernetes --enable-ip-alias \
--scopes cloud-platform --num-nodes 4  --disk-size "10"  --image-type "COS"

Below are the steps in my cloudbuild.yaml file:

 # build the container image
  - name: gcr.io/cloud-builders/docker
    args: [ build, -t, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01, . ]

  # push the container image to Container Registry
  - name: gcr.io/cloud-builders/docker
    args: [ push, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01 ]

  # Deploy container image to Cloud Run
  - name: gcr.io/cloud-builders/gcloud
    args: [ beta, run, deploy, feedback-ui-deploy-anthos, --image, gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01, --platform, gke, --cluster, cloudrun-angular-cluster, --cluster-location, us-central1-a ]


images:

  - gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01

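I have set the environment variable for the gcloud project. Now, whenever I try to deploy to the GKE cluster created above, I always get a "no ready Revision" error: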

Deploying new service... Configuration "service-1" does not have any ready Revision.                                                                        
  - Creating Revision...                                                                                                                                      
  X Routing traffic... Configuration "service-1" does not have any ready Revision. 

This is the command I use to deploy to Cloud Run:

gcloud beta run deploy --platform gke --cluster new-cluster --image gcr.io/$GCLOUD_PROJECT/gcp-cloudrun-gke-angular:1.01 --cluster-location us-east1-b 

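The other option, fully managed Cloud Run, works flawlessly. But when I deploy to an existing GKE cluster, I end up with this error. I read through the documentation, which says that a revision is created automatically for a new service, so I don't know why that is not happening for mine.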

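Edit: Here is the kubectl describe output. I deleted all clusters and created a new one, but ended up with the same result.

This is what I get when describing the service.

Note: I am using the default namespace. Not sure whether that has anything to do with the problem.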

Status:
  Conditions:
    Last Transition Time:  2019-12-04T12:49:59Z
    Message:               Revision "gke-service-00001-pef" failed with message: Container failed with: nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (2: No such file or directory)
2019/12/04 12:49:40 [emerg] 1#1: open() "/var/log/nginx/error.log" failed (2: No such file or directory)
.
    Reason:                      RevisionFailed
    Status:                      False
    Type:                        ConfigurationsReady
    Last Transition Time:        2019-12-04T12:49:59Z
    Message:                     Configuration "gke-service" does not have any ready Revision.
    Reason:                      RevisionMissing
    Status:                      False
    Type:                        Ready
    Last Transition Time:        2019-12-04T12:49:59Z
    Message:                     Configuration "gke-service" does not have any ready Revision.
    Reason:                      RevisionMissing
    Status:                      False
    Type:                        RoutesReady
  Latest Created Revision Name:  gke-service-00001-pef
  Observed Generation:           1
  URL:                           http://gke-service.default.example.com
Events:
  Type    Reason   Age                  From                Message
  ----    ------   ----                 ----                -------
  Normal  Created  2m21s                service-controller  Created Configuration "gke-service"
  Normal  Created  2m21s                service-controller  Created Route "gke-service"
  Normal  Updated  20s (x5 over 2m21s)  service-controller  Updated Service "gke-service"

Since I serve the Angular index.html file through nginx, this is my configuration:

server {


  listen 8080 default_server;

  sendfile on;

  default_type application/octet-stream;

  gzip on;
  gzip_http_version 1.1;
  gzip_disable      "MSIE [1-6]\.";
  gzip_min_length   1100;
  gzip_vary         on;
  gzip_proxied      expired no-cache no-store private auth;
  gzip_types        text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript;
  gzip_comp_level   9;


  root /usr/share/nginx/html;


  location / {
    try_files $uri $uri/ /index.html =404;
    #proxy_pass: "http://localhost:8080/AdTechUIContent"
    #uncomment to include naxsi rules
    #include /etc/nginx/naxsi.rules
  }

}

This works fine when I build the Docker image locally, and I can access the app. Just in case, here is my Dockerfile:

FROM node:12.13-alpine as app-ui-builder

#Now install angular cli globally
RUN npm install -g @angular/cli@8.3.14
#RUN npm config set registry https://registry.cnpmjs.org
#Install git and openssh because the alpine image doesn't have git, and some npm modules have dependencies that are hosted on git
#so to use them we need to be able to run git
RUN apk add --update git openssh
RUN mkdir ./app
COPY package*.json /app/
WORKDIR ./app
COPY . .
RUN npm cache clear --force && npm i

RUN ls && $(npm bin)/ng build --prod

FROM nginx:1.17.5-alpine AS nginx-builder
RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
COPY app-ui-nginx.conf /etc/nginx/conf.d
RUN rm -rf /usr/share/nginx/html/*
COPY --from=app-ui-builder /app/dist/app-ui /usr/share/nginx/html
RUN ls /usr/share/nginx/html
RUN chmod -R a+r /usr/share/nginx/html

EXPOSE 8080
#
CMD ["nginx", "-g", " daemon off;"]

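@AhmetB, can you tell me why nginx is throwing an error here?

Edit: I did try deploying the application with plain kubectl commands, using a Deployment and a Service, and it worked fine. So even though the log file can be found there, I am not sure which Cloud Run contract is being violated by nginx logging errors.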

3 Answers

1

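I found the issue. It looks like the log files (error and access logs) have to be created in a custom folder for Cloud Run to be able to access them. Before starting a revision, Cloud Run checks whether these folders are available. With my old nginx configuration file, the custom folders were never created. After modifying the nginx conf file and redeploying, it works fine.

I created two files. nginx.conf: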

user nginx;
worker_processes  1;

error_log  /var/logs/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/logs/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
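
Because the fix depends on the log directories named in nginx.conf actually existing in the image, it can help to list them and check for them before deploying. The following is only a minimal sketch: `list_log_dirs` and `missing_log_dirs` are hypothetical helper names, and the optional second argument (a filesystem root to check against) is an assumption for illustration, not something nginx or Cloud Run provides.

```shell
# Hypothetical helpers (not part of nginx): find the directories that
# error_log/access_log directives in an nginx config expect to exist.

# Print the unique parent directory of every absolute log path in CONF.
list_log_dirs() {
  awk '($1 == "error_log" || $1 == "access_log") && $2 ~ /^\// {
         sub(/;$/, "", $2)   # drop a trailing semicolon, if any
         print $2
       }' "$1" |
    while read -r path; do dirname "$path"; done |
    sort -u
}

# Print each log directory from CONF that is missing under ROOT
# (ROOT defaults to empty, meaning the real filesystem root).
missing_log_dirs() {
  conf="$1"
  root="${2:-}"
  list_log_dirs "$conf" | while read -r dir; do
    [ -d "$root$dir" ] || echo "$dir"
  done
}
```

Run inside the container as `missing_log_dirs /etc/nginx/nginx.conf`, this should print nothing once the `mkdir` steps for `/var/logs/nginx` are in the Dockerfile; any directory it does print is one nginx would fail to open at startup.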

default.conf:

server {
    listen       8080;
    server_name  localhost;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }

    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

I also modified the Dockerfile:

FROM node:12.13-alpine as app-ui-builder

RUN npm install -g @angular/cli@8.3.14
RUN apk add --update git openssh
RUN mkdir ./app
COPY package*.json /app/
WORKDIR ./app
COPY . .
RUN npm cache clear --force && npm i

RUN ls && $(npm bin)/ng build --prod

FROM nginx:alpine AS nginx-builder
RUN apk update && apk add ca-certificates && rm -rf /var/cache/apk/*
#RUN rm -rf /etc/nginx/conf.d/*
RUN mkdir /var/logs
RUN mkdir /var/logs/nginx
COPY ./docker/nginx.conf /etc/nginx/
## Copy a new configuration file setting listen port to 8080
COPY ./docker/default.conf /etc/nginx/conf.d/
RUN rm -rf /usr/share/nginx/html/*
#
COPY --from=app-ui-builder /app/dist/app-ui /usr/share/nginx/html
EXPOSE 8080
CMD ["nginx", "-g", " daemon off;"]

I found it through this Medium post.

answered 2019-12-04T16:54:22.877
1

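Deploying new service... Configuration "service-1" does not have any ready Revision.

This error means the service was deployed, but the pod is crashing or not being scheduled for some reason. This can happen for various reasons, such as not enough CPU/memory on the nodes, the image failing to pull from GCR, or the application crash-looping.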

Look at the "kubectl logs" and "kubectl describe" output for your application. Try:

  • kubectl get ksvc
  • kubectl get pods
  • kubectl describe ksvc NAME
  • kubectl logs NAME -c user-container
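
The commands above could be wrapped into one helper, roughly as sketched here. `diagnose_ksvc` is a hypothetical function name, the defaults `gke-service` and `default` are taken from the question's output, and the sketch assumes the pods carry the `serving.knative.dev/service` label so they can be selected without looking up their generated names.

```shell
# Hypothetical helper: gather the usual diagnostics for a Knative service.
# The defaults "gke-service" and "default" come from the question's output.
diagnose_ksvc() {
  service="${1:-gke-service}"
  namespace="${2:-default}"
  # Assumption: pods are labelled with the name of the service they back.
  selector="serving.knative.dev/service=$service"

  # Overall state of the service and its conditions.
  kubectl get ksvc "$service" -n "$namespace"
  kubectl describe ksvc "$service" -n "$namespace"

  # Pods backing the service's revisions.
  kubectl get pods -n "$namespace" -l "$selector"

  # Recent logs from the application container.
  kubectl logs -n "$namespace" -l "$selector" -c user-container --tail=50
}
```

Calling `diagnose_ksvc gke-service default` runs the four commands in order; the `describe` output is usually where a message such as the nginx error log failure shows up.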
answered 2019-11-29T21:07:50.230
0

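Does your cluster have role-based access control or storage permissions configured? I would also suggest verifying the permissions required for deploying to Cloud Run for Anthos.

Check whether you have the storage permission scope.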
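
One way to check the scope is sketched below, under stated assumptions: `has_storage_read_scope` is a hypothetical helper, and the idea is to feed it the node scopes reported by `gcloud container clusters describe` (the `nodeConfig.oauthScopes` field path is assumed here and may differ by cluster configuration).

```shell
# Hypothetical helper: succeed if a list of OAuth scopes (separated by
# semicolons, commas, or whitespace) includes one that allows reading from
# Cloud Storage, which is where Container Registry keeps image layers.
has_storage_read_scope() {
  base="https://www.googleapis.com/auth"
  for scope in $(printf '%s' "$1" | tr ';,' '  '); do
    case "$scope" in
      "$base/cloud-platform"|"$base"/devstorage.*) return 0 ;;
    esac
  done
  return 1
}

# Usage (cluster name and zone taken from the question):
#   scopes=$(gcloud container clusters describe new-cluster \
#     --zone us-east1-b --format='value(nodeConfig.oauthScopes)')
#   has_storage_read_scope "$scopes" && echo "nodes can read from GCR"
```

The cluster in the question was created with `--scopes cloud-platform`, so this check would be expected to pass there.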

answered 2019-11-29T19:32:05.557