
I'm hoping someone can give me some ideas, or point me to further reading, on building a custom Android app with the MediaPipe Iris .aar. I've been through the official MediaPipe documentation, but I found it somewhat limited and I'm now struggling to make progress. I've been trying to supply the side packet the Iris graph expects and to extract specific landmark coordinates in real time.

My goal is to create an open-source, gaze-direction-driven text-to-speech keyboard for accessibility purposes. It will use a modified MediaPipe Iris solution to infer the user's gaze direction and drive the app's controls. I would greatly appreciate any help with this.

Here is my development plan and my progress so far:

  1. Set up MediaPipe and build the examples from the command line. DONE
  2. Generate the .aars for face detection and iris tracking. DONE
  3. Set up Android Studio to build MediaPipe apps. DONE
  4. Build and test the face detection example app using the .aar. DONE
  5. Modify the face detection example to use the Iris .aar. IN PROGRESS
  6. Output the coordinates of the iris and eye-corner landmarks, and the distances between them, to estimate gaze direction in real time (see the sketch right after this list). Alternatively, modify the graph and calculators to do that inference for me and rebuild the .aar.
  7. Integrate gaze direction into the app's control scheme.
  8. Expand the app's functionality once the initial controls are working.
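
For step 6, my current thinking (untested, just a sketch) is to attach a packet callback to the graph's landmark output stream and read the normalized coordinates from there. The stream name comes from the constants in my activity below; the landmark count and the index used for the iris center are assumptions based on the Iris documentation (468 face-mesh points followed by 10 iris points) and still need to be verified against the actual graph output:

    // Extra imports this would need:
    // import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmark;
    // import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmarkList;
    // import com.google.mediapipe.framework.PacketGetter;
    // import com.google.protobuf.InvalidProtocolBufferException;

    // Untested sketch: log one iris landmark from the landmark output stream.
    processor.addPacketCallback(
        OUTPUT_LANDMARKS_STREAM_NAME,
        (packet) -> {
          try {
            byte[] landmarksRaw = PacketGetter.getProtoBytes(packet);
            NormalizedLandmarkList landmarks = NormalizedLandmarkList.parseFrom(landmarksRaw);
            // Assumption: the 468 face-mesh landmarks come first, followed by 10 iris
            // landmarks, so index 468 should be one of the iris centers (which eye it
            // belongs to still needs checking against the face-mesh landmark map).
            NormalizedLandmark irisCenter = landmarks.getLandmark(468);
            Log.v(TAG, "Iris center: x=" + irisCenter.getX() + ", y=" + irisCenter.getY());
          } catch (InvalidProtocolBufferException e) {
            Log.e(TAG, "Failed to parse landmarks packet.", e);
          }
        });

From the same callback, the eye-corner landmarks could then be compared against the iris center to derive a rough gaze ratio for step 7.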

So far I have generated the Iris .aar with the build file below. Does the .aar I built include the calculators for the main graph and its subgraphs, or do I need to add something else to my AAR build file?

.aar build file:

load("//mediapipe/java/com/google/mediapipe:mediapipe_aar.bzl", "mediapipe_aar")

mediapipe_aar(
    name = "mp_iris_tracking_aar",
    calculators = ["//mediapipe/graphs/iris_tracking:iris_tracking_gpu_deps"],
)

At the moment I have an Android Studio project set up with the following assets and the Iris .aar above.

Android Studio Assets:
iris_tracking_gpu.binarypb
face_landmark.tflite
iris_landmark.tflite
face_detection_front.tflite

Right now I'm simply trying to build it as-is, to better understand the process and to verify that my build environment is set up correctly. I have successfully built and tested the face detection example listed in the documentation and it runs correctly, but after modifying the project to use the Iris .aar it builds fine and then crashes at runtime with: Side Packet "focal_length_pixel" is required but was not provided.

I tried adding the focal-length code below to onCreate, based on the Iris example in the MediaPipe repo, but I can't work out how to adapt it for use with the Iris .aar. Is there any further documentation I can read to point me in the right direction?

I need to integrate this snippet (I think) into the modified face detection example code, but I'm not sure how (my current guess is sketched after the full activity code below). Thanks for your help :)

    float focalLength = cameraHelper.getFocalLengthPixels();
    if (focalLength != Float.MIN_VALUE) {
      Packet focalLengthSidePacket = processor.getPacketCreator().createFloat32(focalLength);
      Map<String, Packet> inputSidePackets = new HashMap<>();
      inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
      processor.setInputSidePackets(inputSidePackets);
    }
    haveAddedSidePackets = true;
Modified face detection AAR example (MainActivity):
package com.example.iristracking;

// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import android.graphics.SurfaceTexture;
import android.os.Bundle;
import android.util.Log;
import java.util.HashMap;
import java.util.Map;
import androidx.appcompat.app.AppCompatActivity;
import android.util.Size;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.view.ViewGroup;
import com.google.mediapipe.components.CameraHelper;
import com.google.mediapipe.components.CameraXPreviewHelper;
import com.google.mediapipe.components.ExternalTextureConverter;
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.components.PermissionHelper;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.glutil.EglManager;

/** Main activity of MediaPipe example apps. */
public class MainActivity extends AppCompatActivity {
private static final String TAG = "MainActivity";
private boolean haveAddedSidePackets = false;

private static final String FOCAL_LENGTH_STREAM_NAME = "focal_length_pixel";
private static final String OUTPUT_LANDMARKS_STREAM_NAME = "face_landmarks_with_iris";

private static final String BINARY_GRAPH_NAME = "iris_tracking_gpu.binarypb";
private static final String INPUT_VIDEO_STREAM_NAME = "input_video";
private static final String OUTPUT_VIDEO_STREAM_NAME = "output_video";
private static final CameraHelper.CameraFacing CAMERA_FACING = CameraHelper.CameraFacing.FRONT;

// Flips the camera-preview frames vertically before sending them into FrameProcessor to be
// processed in a MediaPipe graph, and flips the processed frames back when they are displayed.
// This is needed because OpenGL represents images assuming the image origin is at the bottom-left
// corner, whereas MediaPipe in general assumes the image origin is at top-left.
private static final boolean FLIP_FRAMES_VERTICALLY = true;

static {
    // Load all native libraries needed by the app.
    System.loadLibrary("mediapipe_jni");
    System.loadLibrary("opencv_java3");
}

// {@link SurfaceTexture} where the camera-preview frames can be accessed.
private SurfaceTexture previewFrameTexture;
// {@link SurfaceView} that displays the camera-preview frames processed by a MediaPipe graph.
private SurfaceView previewDisplayView;

// Creates and manages an {@link EGLContext}.
private EglManager eglManager;
// Sends camera-preview frames into a MediaPipe graph for processing, and displays the processed
// frames onto a {@link Surface}.
private FrameProcessor processor;
// Converts the GL_TEXTURE_EXTERNAL_OES texture from Android camera into a regular texture to be
// consumed by {@link FrameProcessor} and the underlying MediaPipe graph.
private ExternalTextureConverter converter;

// Handles camera access via the {@link CameraX} Jetpack support library.
private CameraXPreviewHelper cameraHelper;


@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);

    previewDisplayView = new SurfaceView(this);
    setupPreviewDisplayView();

    // Initialize asset manager so that MediaPipe native libraries can access the app assets, e.g.,
    // binary graphs.
    AndroidAssetUtil.initializeNativeAssetManager(this);

    eglManager = new EglManager(null);
    processor =
            new FrameProcessor(
                    this,
                    eglManager.getNativeContext(),
                    BINARY_GRAPH_NAME,
                    INPUT_VIDEO_STREAM_NAME,
                    OUTPUT_VIDEO_STREAM_NAME);
    processor.getVideoSurfaceOutput().setFlipY(FLIP_FRAMES_VERTICALLY);

    PermissionHelper.checkAndRequestCameraPermissions(this);


}

@Override
protected void onResume() {
    super.onResume();
    converter = new ExternalTextureConverter(eglManager.getContext());
    converter.setFlipY(FLIP_FRAMES_VERTICALLY);
    converter.setConsumer(processor);
    if (PermissionHelper.cameraPermissionsGranted(this)) {
        startCamera();
    }
}

@Override
protected void onPause() {
    super.onPause();
    converter.close();
}

@Override
public void onRequestPermissionsResult(
        int requestCode, String[] permissions, int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    PermissionHelper.onRequestPermissionsResult(requestCode, permissions, grantResults);
}

private void setupPreviewDisplayView() {
    previewDisplayView.setVisibility(View.GONE);
    ViewGroup viewGroup = findViewById(R.id.preview_display_layout);
    viewGroup.addView(previewDisplayView);

    previewDisplayView
            .getHolder()
            .addCallback(
                    new SurfaceHolder.Callback() {
                        @Override
                        public void surfaceCreated(SurfaceHolder holder) {
                            processor.getVideoSurfaceOutput().setSurface(holder.getSurface());
                        }

                        @Override
                        public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                            // (Re-)Compute the ideal size of the camera-preview display (the area that the
                            // camera-preview frames get rendered onto, potentially with scaling and rotation)
                            // based on the size of the SurfaceView that contains the display.
                            Size viewSize = new Size(width, height);
                            Size displaySize = cameraHelper.computeDisplaySizeFromViewSize(viewSize);

                            // Connect the converter to the camera-preview frames as its input (via
                            // previewFrameTexture), and configure the output width and height as the computed
                            // display size.
                            converter.setSurfaceTextureAndAttachToGLContext(
                                    previewFrameTexture, displaySize.getWidth(), displaySize.getHeight());
                        }

                        @Override
                        public void surfaceDestroyed(SurfaceHolder holder) {
                            processor.getVideoSurfaceOutput().setSurface(null);
                        }
                    });
}

private void startCamera() {
    cameraHelper = new CameraXPreviewHelper();
    cameraHelper.setOnCameraStartedListener(
            surfaceTexture -> {
                previewFrameTexture = surfaceTexture;
                // Make the display view visible to start showing the preview. This triggers the
                // SurfaceHolder.Callback added to (the holder of) previewDisplayView.
                previewDisplayView.setVisibility(View.VISIBLE);
            });
    cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);

}
}
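
My current (untested) guess for integrating that snippet: the side packet has to be set before any frames reach the graph, so it would go inside the camera-started listener in startCamera(), reusing the fields from the activity above, roughly like this:

    private void startCamera() {
        cameraHelper = new CameraXPreviewHelper();
        cameraHelper.setOnCameraStartedListener(
                surfaceTexture -> {
                    // Set the focal-length side packet before making the preview visible,
                    // so the graph has it before the first frame arrives.
                    if (!haveAddedSidePackets) {
                        float focalLength = cameraHelper.getFocalLengthPixels();
                        if (focalLength != Float.MIN_VALUE) {
                            Packet focalLengthSidePacket =
                                    processor.getPacketCreator().createFloat32(focalLength);
                            Map<String, Packet> inputSidePackets = new HashMap<>();
                            inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
                            processor.setInputSidePackets(inputSidePackets);
                        }
                        haveAddedSidePackets = true;
                    }
                    previewFrameTexture = surfaceTexture;
                    // Make the display view visible to start showing the preview. This triggers
                    // the SurfaceHolder.Callback added to (the holder of) previewDisplayView.
                    previewDisplayView.setVisibility(View.VISIBLE);
                });
        cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);
    }

One thing I haven't worked out: if the device doesn't report a focal length (Float.MIN_VALUE), the side packet would still be missing and I assume the graph would crash again, so some fallback value is probably needed.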

1 Answer

override fun onResume() {
        super.onResume()
        converter = ExternalTextureConverter(eglManager?.context, NUM_BUFFERS)

        if (PermissionHelper.cameraPermissionsGranted(this)) {
            var rotation: Int = 0
            if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.R) {
                rotation = this.display!!.rotation
            } else {
                rotation = this.windowManager.defaultDisplay.rotation
            }

            converter!!.setRotation(rotation)
            converter!!.setFlipY(FLIP_FRAMES_VERTICALLY)

            startCamera(rotation)

            if (!haveAddedSidePackets) {
                val packetCreator = mediapipeFrameProcessor!!.getPacketCreator();
                val inputSidePackets = mutableMapOf<String, Packet>()

                focalLength = cameraHelper?.focalLengthPixels!!
                Log.i(TAG_MAIN, "OnStarted focalLength: ${cameraHelper?.focalLengthPixels!!}")
                inputSidePackets.put(
                    FOCAL_LENGTH_STREAM_NAME,
                    packetCreator.createFloat32(focalLength.width.toFloat())
                )
                mediapipeFrameProcessor!!.setInputSidePackets(inputSidePackets)
                haveAddedSidePackets = true

                val imageSize = cameraHelper!!.imageSize
                val calibrateMatrix = Matrix()
                calibrateMatrix.setValues(
                    floatArrayOf(
                        focalLength.width * 1.0f,
                        0.0f,
                        imageSize.width / 2.0f,
                        0.0f,
                        focalLength.height * 1.0f,
                        imageSize.height / 2.0f,
                        0.0f,
                        0.0f,
                        1.0f
                    )
                )
                val isInvert = calibrateMatrix.invert(matrixPixels2World)
                if (!isInvert) {
                    matrixPixels2World = Matrix()
                }
            }
            converter!!.setConsumer(mediapipeFrameProcessor)
        }
    }
Answered 2021-06-05T08:41:23.003