I can't send landmarks to the model in the correct order, and I'm also getting wrong results from the model #5813

Open
sahmtzdmr opened this issue Jan 9, 2025 · 0 comments
Assignees
Labels
platform:android (Issues with Android as Platform)
task:gesture recognition (Issues related to hand gesture recognition: Identify and recognize hand gestures)
task:image classification (Issues related to Image Classification: Identify content in images and video)
type:support (General questions)

Comments

@sahmtzdmr

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Android

MediaPipe Tasks SDK version

No response

Task name (e.g. Image classification, Gesture recognition etc.)

Image classification and Gesture recognition

Programming Language and version (e.g. C++, Python, Java)

Java

Describe the actual behavior

The model returns wrong results, and I can't pass the landmarks in the required order.

Describe the expected behaviour

The landmarks are passed to the model in the correct order (pose, left hand, right hand, face).

Standalone code/steps you may have used to try to get what you need

.

Other info / Complete Logs

I need to pass the landmarks in the order pose, left hand, right hand, face, but I can't pass them that way: every time they come out ordered pose, right hand, left hand, face. Because of this I can't get the correct result from the model. When I hold up the camera and sign "devil" in sign language, the model returns "thursday". Here is the code:

public class holistic_activity extends AppCompatActivity {

    private static final String TAG = "MainActivity";
    private Interpreter tfliteInterpreter;
    private static final boolean FLIP_FRAMES_VERTICALLY = true;

    private static final int MIN_LANDMARK_COUNT = 20; // Minimum landmark count for each category
    private static final int FRAME_LANDMARKS_COMPLETE = 4; // pose, left hand, right hand, face
    private static final int TOTAL_LANDMARK_COUNT = 543; // 33 (pose) + 21 (left) + 21 (right) + 468 (face)
    private static final int COORDS_PER_LANDMARK = 4;  // x, y, z, visibility
    private static final int LANDMARK_HISTORY_SIZE = 30;     // Time dimension (temporal)
    private static final int FEATURES_PER_FRAME = 1662;      // Number of features per frame
    private static final int MODEL_INPUT_SIZE = LANDMARK_HISTORY_SIZE * FEATURES_PER_FRAME;
    private static final int FEATURE_COUNT = 1662;
    private static final int POSE_LANDMARK_COUNT = 33;
    private static final int HAND_LANDMARK_COUNT = 21;
    private static final int FACE_LANDMARK_COUNT = 468;

    private Queue<NormalizedLandmarkList> poseLandmarksQueue = new LinkedList<>();
    private Queue<NormalizedLandmarkList> leftHandLandmarksQueue = new LinkedList<>();
    private Queue<NormalizedLandmarkList> rightHandLandmarksQueue = new LinkedList<>();
    private Queue<NormalizedLandmarkList> faceLandmarksQueue = new LinkedList<>();
    private int currentFrameLandmarkCount = 0;
    private Object landmarkLock = new Object();

    private NormalizedLandmarkList currentPoseLandmarks = null;
    private NormalizedLandmarkList currentLeftHandLandmarks = null;
    private NormalizedLandmarkList currentRightHandLandmarks = null;
    private NormalizedLandmarkList currentFaceLandmarks = null;


    static {
        System.loadLibrary("mediapipe_jni");
        try {
            System.loadLibrary("opencv_java3");
        } catch (UnsatisfiedLinkError e) {
            System.loadLibrary("opencv_java4");
        }
    }

    protected FrameProcessor processor;
    protected CameraXPreviewHelper cameraHelper;
    private SurfaceTexture previewFrameTexture;
    private SurfaceView previewDisplayView;
    private EglManager eglManager;
    private ExternalTextureConverter converter;
    private ApplicationInfo applicationInfo;
    private final List<NormalizedLandmarkList> poseLandmarksHistory = new ArrayList<>();
    private final List<NormalizedLandmarkList> leftHandLandmarksHistory = new ArrayList<>();
    private final List<NormalizedLandmarkList> rightHandLandmarksHistory = new ArrayList<>();
    private final List<NormalizedLandmarkList> faceLandmarksHistory = new ArrayList<>();


    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_holistic_activity);
        Log.d(TAG, "onCreate called.");

        try {
            applicationInfo = getPackageManager().getApplicationInfo(getPackageName(), PackageManager.GET_META_DATA);
        } catch (PackageManager.NameNotFoundException e) {
            Log.e(TAG, "Application info not found: " + e);
            return;
        }

        previewDisplayView = new SurfaceView(this);
        setupPreviewDisplayView();

        AndroidAssetUtil.initializeNativeAssetManager(this);
        eglManager = new EglManager(null);

        String binaryGraphName = applicationInfo.metaData.getString("binaryGraphName");
        String inputStreamName = applicationInfo.metaData.getString("inputVideoStreamName");
        String outputStreamName = applicationInfo.metaData.getString("outputVideoStreamName");

        Log.d(TAG, "Binary Graph: " + binaryGraphName);
        Log.d(TAG, "Input Stream: " + inputStreamName);
        Log.d(TAG, "Output Stream: " + outputStreamName);

        if (binaryGraphName == null || inputStreamName == null || outputStreamName == null) {
            Log.e(TAG, "Metadata missing. Check AndroidManifest.xml.");
            return;
        }

        processor = new FrameProcessor(this, eglManager.getNativeContext(), binaryGraphName, inputStreamName, outputStreamName);

        if (processor.getVideoSurfaceOutput() != null) {
            processor.getVideoSurfaceOutput().setFlipY(applicationInfo.metaData.getBoolean("flipFramesVertically", FLIP_FRAMES_VERTICALLY));
        }

        PermissionHelper.checkAndRequestCameraPermissions(this);

        try {
            tfliteInterpreter = new Interpreter(loadModelFile());
            Log.d(TAG, "Model loaded successfully");
        } catch (IOException e) {
            Log.e(TAG, "Failed to load model: " + e.getMessage());
        }

        // Register the NormalizedLandmarkList type
        ProtoUtil.registerTypeName(NormalizedLandmarkList.class, "mediapipe.NormalizedLandmarkList");

        // Start the callbacks
        startLandmarkPacketCallbacks();
    }

    private void startLandmarkPacketCallbacks() {
        // Pose landmarks callback
        processor.addPacketCallback(
                "pose_landmarks",
                (packet) -> {
                    synchronized (landmarkLock) {
                        Log.d(TAG, "Pose landmarks packet received.");
                        if (packet != null) {
                            try {
                                NormalizedLandmarkList landmarks = PacketGetter.getProto(packet, NormalizedLandmarkList.class);
                                if (landmarks != null) {
                                    Log.d(TAG, "Pose landmark count: " + landmarks.getLandmarkList().size());
                                    currentPoseLandmarks = landmarks;
                                    checkFrameComplete();
                                }
                            } catch (InvalidProtocolBufferException e) {
                                Log.e(TAG, "Pose landmarks error: " + e.getMessage());
                            }
                        }
                    }
                }
        );

        // Left hand landmarks callback
        processor.addPacketCallback(
                "left_hand_landmarks",
                (packet) -> {
                    synchronized (landmarkLock) {
                        Log.d(TAG, "Left hand landmarks packet received.");
                        if (packet != null) {
                            try {
                                NormalizedLandmarkList landmarks = PacketGetter.getProto(packet, NormalizedLandmarkList.class);
                                if (landmarks != null) {
                                    Log.d(TAG, "Left hand landmark count: " + landmarks.getLandmarkList().size());
                                    currentLeftHandLandmarks = landmarks;
                                    checkFrameComplete();
                                }
                            } catch (InvalidProtocolBufferException e) {
                                Log.e(TAG, "Left hand landmarks error: " + e.getMessage());
                            }
                        }
                    }
                }
        );

        // Right hand landmarks callback
        processor.addPacketCallback(
                "right_hand_landmarks",
                (packet) -> {
                    synchronized (landmarkLock) {
                        Log.d(TAG, "Right hand landmarks packet received.");
                        if (packet != null) {
                            try {
                                NormalizedLandmarkList landmarks = PacketGetter.getProto(packet, NormalizedLandmarkList.class);
                                if (landmarks != null) {
                                    Log.d(TAG, "Right hand landmark count: " + landmarks.getLandmarkList().size());
                                    currentRightHandLandmarks = landmarks;
                                    checkFrameComplete();
                                }
                            } catch (InvalidProtocolBufferException e) {
                                Log.e(TAG, "Right hand landmarks error: " + e.getMessage());
                            }
                        }
                    }
                }
        );

        // Face landmarks callback
        processor.addPacketCallback(
                "face_landmarks",
                (packet) -> {
                    synchronized (landmarkLock) {
                        Log.d(TAG, "Face landmarks packet received.");
                        if (packet != null) {
                            try {
                                NormalizedLandmarkList landmarks = PacketGetter.getProto(packet, NormalizedLandmarkList.class);
                                if (landmarks != null) {
                                    Log.d(TAG, "Face landmark count: " + landmarks.getLandmarkList().size());
                                    currentFaceLandmarks = landmarks;
                                    checkFrameComplete();
                                }
                            } catch (InvalidProtocolBufferException e) {
                                Log.e(TAG, "Face landmarks error: " + e.getMessage());
                            }
                        }
                    }
                }
        );
    }

    private void checkFrameComplete() {
        if (currentPoseLandmarks != null && currentLeftHandLandmarks != null &&
                currentRightHandLandmarks != null && currentFaceLandmarks != null) {
            Log.d(TAG, "All landmarks for the frame collected. Starting processing...");

            // Add to the queues
            poseLandmarksQueue.offer(currentPoseLandmarks);
            leftHandLandmarksQueue.offer(currentLeftHandLandmarks);
            rightHandLandmarksQueue.offer(currentRightHandLandmarks);
            faceLandmarksQueue.offer(currentFaceLandmarks);

            // Check and log the queue sizes
            Log.d(TAG, String.format("Queue status - Pose: %d, Left: %d, Right: %d, Face: %d / %d required",
                    poseLandmarksQueue.size(),
                    leftHandLandmarksQueue.size(),
                    rightHandLandmarksQueue.size(),
                    faceLandmarksQueue.size(),
                    LANDMARK_HISTORY_SIZE));

            // Cap the queues at the history size
            while (poseLandmarksQueue.size() > LANDMARK_HISTORY_SIZE) poseLandmarksQueue.poll();
            while (leftHandLandmarksQueue.size() > LANDMARK_HISTORY_SIZE) leftHandLandmarksQueue.poll();
            while (rightHandLandmarksQueue.size() > LANDMARK_HISTORY_SIZE) rightHandLandmarksQueue.poll();
            while (faceLandmarksQueue.size() > LANDMARK_HISTORY_SIZE) faceLandmarksQueue.poll();

            // Check whether there is enough data
            if (poseLandmarksQueue.size() == LANDMARK_HISTORY_SIZE &&
                    leftHandLandmarksQueue.size() == LANDMARK_HISTORY_SIZE &&
                    rightHandLandmarksQueue.size() == LANDMARK_HISTORY_SIZE &&
                    faceLandmarksQueue.size() == LANDMARK_HISTORY_SIZE) {

                Log.d(TAG, "History data complete. Starting model prediction...");
                processLandmarks();
            } else {
                Log.d(TAG, "Not enough history data collected yet. Waiting...");
            }

            resetCurrentFrame();
        }
    }

    private void resetCurrentFrame() {
        currentPoseLandmarks = null;
        currentLeftHandLandmarks = null;
        currentRightHandLandmarks = null;
        currentFaceLandmarks = null;
    }


    private synchronized void checkAndProcessLandmarks() {
        if (allLandmarksCollected()) {
            processLandmarks();
        }
    }

    private boolean allLandmarksCollected() {
        return poseLandmarksHistory.size() >= MIN_LANDMARK_COUNT && leftHandLandmarksHistory.size() >= MIN_LANDMARK_COUNT && rightHandLandmarksHistory.size() >= MIN_LANDMARK_COUNT && faceLandmarksHistory.size() >= MIN_LANDMARK_COUNT;
    }

    @Override
    protected void onResume() {
        super.onResume();
        if (converter == null) {
            converter = new ExternalTextureConverter(eglManager.getContext());
            converter.setFlipY(applicationInfo.metaData.getBoolean("flipFramesVertically", FLIP_FRAMES_VERTICALLY));
            converter.setConsumer(processor);
        }

        if (PermissionHelper.cameraPermissionsGranted(this)) {
            startCamera();
        }
    }

    @Override
    protected void onPause() {
        super.onPause();
        if (converter != null) {
            converter.close();
            converter = null;
        }
        if (cameraHelper != null) {
            previewDisplayView.setVisibility(View.GONE);
            previewFrameTexture = null;
        }
    }

    @Override
    protected void onDestroy() {
        super.onDestroy();
        if (converter != null) {
            converter.close();
            converter = null;
        }
        if (processor != null) {
            processor.close();
        }
        if (eglManager != null) {
            eglManager.release();
        }
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        PermissionHelper.onRequestPermissionsResult(requestCode, permissions, grantResults);
    }

    protected void onCameraStarted(SurfaceTexture surfaceTexture) {
        previewFrameTexture = surfaceTexture;
        previewDisplayView.setVisibility(View.VISIBLE);
    }

    protected Size cameraTargetResolution() {
        return null; // No preference and let the camera (helper) decide.
    }

    public void startCamera() {
        if (cameraHelper != null) {
            cameraHelper = null;
        }
        cameraHelper = new CameraXPreviewHelper();
        cameraHelper.setOnCameraStartedListener(this::onCameraStarted);
        CameraHelper.CameraFacing cameraFacing = applicationInfo.metaData.getBoolean("cameraFacingFront", false) ? CameraHelper.CameraFacing.FRONT : CameraHelper.CameraFacing.BACK;
        cameraHelper.startCamera(this, cameraFacing, null, cameraTargetResolution());
    }

    protected Size computeViewSize(int width, int height) {
        return new Size(width, height);
    }

    protected void onPreviewDisplaySurfaceChanged(SurfaceHolder holder, int format, int width, int height) {
        Size viewSize = computeViewSize(width, height);
        Size displaySize = cameraHelper.computeDisplaySizeFromViewSize(viewSize);
        boolean isCameraRotated = cameraHelper.isCameraRotated();
        converter.setSurfaceTextureAndAttachToGLContext(previewFrameTexture, isCameraRotated ? displaySize.getHeight() : displaySize.getWidth(), isCameraRotated ? displaySize.getWidth() : displaySize.getHeight());
    }

    private void setupPreviewDisplayView() {
        previewDisplayView.setVisibility(View.GONE);
        ViewGroup viewGroup = findViewById(R.id.preview_display_layout);
        viewGroup.addView(previewDisplayView);
        previewDisplayView.getHolder().addCallback(new SurfaceHolder.Callback() {
            @Override
            public void surfaceCreated(SurfaceHolder holder) {
                processor.getVideoSurfaceOutput().setSurface(holder.getSurface());
            }

            @Override
            public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                onPreviewDisplaySurfaceChanged(holder, format, width, height);
            }

            @Override
            public void surfaceDestroyed(SurfaceHolder holder) {
                processor.getVideoSurfaceOutput().setSurface(null);
            }
        });
    }

    private MappedByteBuffer loadModelFile() throws IOException {
        AssetFileDescriptor fileDescriptor = this.getAssets().openFd("model_mobil_deneme_3.tflite");
        FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
        FileChannel fileChannel = inputStream.getChannel();
        long startOffset = fileDescriptor.getStartOffset();
        long declaredLength = fileDescriptor.getDeclaredLength();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
    }

    public List<String> loadLabels() {
        List<String> labels = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(getAssets().open("labels.txt")))) {
            String line;
            while ((line = reader.readLine()) != null) {
                labels.add(line.trim());
            }
        } catch (IOException e) {
            Log.e(TAG, "Could not read the labels file: " + e.getMessage());
        }
        return labels;
    }

    private void processLandmarks() {
        new Thread(() -> {
            try {
                ByteBuffer inputBuffer = ByteBuffer.allocateDirect(LANDMARK_HISTORY_SIZE * FEATURES_PER_FRAME * Float.BYTES);
                inputBuffer.order(ByteOrder.nativeOrder());
                inputBuffer.clear();

                Log.d(TAG, "Preparing input buffer...");

                // Add the features for each frame
                for (int i = 0; i < LANDMARK_HISTORY_SIZE; i++) {
                    if (i < poseLandmarksQueue.size() &&
                            i < leftHandLandmarksQueue.size() &&
                            i < rightHandLandmarksQueue.size() &&
                            i < faceLandmarksQueue.size()) {

                        addReducedLandmarks(inputBuffer,
                                poseLandmarksQueue.toArray(new NormalizedLandmarkList[0])[i],        // 33 points * 2 = 66 features
                                leftHandLandmarksQueue.toArray(new NormalizedLandmarkList[0])[i],    // 21 points * 2 = 42 features
                                rightHandLandmarksQueue.toArray(new NormalizedLandmarkList[0])[i],   // 21 points * 2 = 42 features
                                faceLandmarksQueue.toArray(new NormalizedLandmarkList[0])[i]         // 468 * 2 = 936 features
                        );                                      // Total: 1086 features

                        for (int j = 0; j < (FEATURES_PER_FRAME - 1086); j++) {
                            inputBuffer.putFloat(0.0f);
                        }
                    } else {
                        for (int j = 0; j < FEATURES_PER_FRAME; j++) {
                            inputBuffer.putFloat(0.0f);
                        }
                    }
                }

                inputBuffer.rewind();

                float[][] output = new float[1][226];
                tfliteInterpreter.run(inputBuffer, output);
                Log.d(TAG, "Model ran successfully");

                processModelOutput(output[0]);

            } catch (Exception e) {
                Log.e(TAG, "Error during prediction: " + e.getMessage());
                e.printStackTrace();
            }
        }).start();
    }

    private void addReducedLandmarks(ByteBuffer buffer,
                                     NormalizedLandmarkList pose,
                                     NormalizedLandmarkList leftHand,
                                     NormalizedLandmarkList rightHand,
                                     NormalizedLandmarkList face) {
        for (LandmarkProto.NormalizedLandmark landmark : pose.getLandmarkList()) {
            buffer.putFloat(landmark.getX());
            buffer.putFloat(landmark.getY());
        }

        for (LandmarkProto.NormalizedLandmark landmark : leftHand.getLandmarkList()) {
            buffer.putFloat(landmark.getX());
            buffer.putFloat(landmark.getY());
        }

        for (LandmarkProto.NormalizedLandmark landmark : rightHand.getLandmarkList()) {
            buffer.putFloat(landmark.getX());
            buffer.putFloat(landmark.getY());
        }

        for (LandmarkProto.NormalizedLandmark landmark : face.getLandmarkList()) {
            buffer.putFloat(landmark.getX());
            buffer.putFloat(landmark.getY());
        }
    }

    private void processModelOutput(float[] output) {
        int predictedClassIndex = -1;
        float maxProbability = -1;
        for (int i = 0; i < output.length; i++) {
            if (output[i] > maxProbability) {
                maxProbability = output[i];
                predictedClassIndex = i;
            }
        }

        List<String> classLabels = loadLabels();
        final String predictedLabel = (predictedClassIndex >= 0 && predictedClassIndex < classLabels.size())
                ? classLabels.get(predictedClassIndex) : "Unknown";
        final String result = String.format(Locale.US, "Prediction: %s (%.2f%%)",
                predictedLabel, maxProbability * 100);
        Log.d(TAG, "Model prediction: " + result);

        runOnUiThread(() -> {
            TextView resultView = findViewById(R.id.result_text_view);
            if (resultView != null) {
                resultView.setText(result);
            }
        });
    }
}
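
One thing that may be worth checking, although it is not confirmed anywhere above: FEATURES_PER_FRAME is 1662, which equals 33*4 + 21*3 + 21*3 + 468*3, that is x, y, z and visibility for each pose landmark and x, y, z for each hand and face landmark. addReducedLandmarks writes only x and y (1086 values) and zero-pads the rest, so the layout may not match what the model was trained on. The sketch below packs one frame in the order pose, left hand, right hand, face and zero-fills any part that was not detected. It is a minimal sketch under those assumptions, not the model's known input layout: FeaturePacker, put and packFrame are hypothetical names, and plain float arrays stand in for NormalizedLandmarkList so the example runs without MediaPipe.

```java
// Hypothetical helper: packs one frame of landmarks into a fixed-order feature vector.
// Assumed layout (must match the training pipeline): pose, left hand, right hand, face.
class FeaturePacker {
    static final int POSE_COUNT = 33;
    static final int HAND_COUNT = 21;
    static final int FACE_COUNT = 468;
    static final int POSE_DIMS = 4;  // x, y, z, visibility (assumed)
    static final int OTHER_DIMS = 3; // x, y, z (assumed)
    // 33*4 + (21 + 21 + 468)*3 = 1662
    static final int FEATURES_PER_FRAME =
            POSE_COUNT * POSE_DIMS + (2 * HAND_COUNT + FACE_COUNT) * OTHER_DIMS;

    // Copies up to `count` landmarks of `dims` values each into `out` at `offset`.
    // A null part (for example an undetected hand) is left as zeros.
    // Always returns the offset of the next part, so the layout stays fixed.
    static int put(float[] out, int offset, float[][] part, int count, int dims) {
        if (part != null) {
            int n = Math.min(count, part.length);
            for (int i = 0; i < n; i++) {
                System.arraycopy(part[i], 0, out, offset + i * dims, dims);
            }
        }
        return offset + count * dims;
    }

    static float[] packFrame(float[][] pose, float[][] leftHand,
                             float[][] rightHand, float[][] face) {
        float[] out = new float[FEATURES_PER_FRAME];
        int offset = 0;
        offset = put(out, offset, pose, POSE_COUNT, POSE_DIMS);
        offset = put(out, offset, leftHand, HAND_COUNT, OTHER_DIMS);
        offset = put(out, offset, rightHand, HAND_COUNT, OTHER_DIMS);
        put(out, offset, face, FACE_COUNT, OTHER_DIMS);
        return out;
    }

    public static void main(String[] args) {
        float[][] pose = new float[POSE_COUNT][POSE_DIMS];
        float[][] leftHand = new float[HAND_COUNT][OTHER_DIMS];
        leftHand[0][0] = 0.5f;
        // Right hand and face are missing in this frame and stay zero-filled.
        float[] frame = packFrame(pose, leftHand, null, null);
        System.out.println(frame.length + " features, left hand x at index "
                + POSE_COUNT * POSE_DIMS + " = " + frame[POSE_COUNT * POSE_DIMS]);
    }
}
```

In the activity, each row would be filled from NormalizedLandmark.getX(), getY(), getZ() and, for pose, getVisibility(). Because every part has a fixed offset, the order in which the packet callbacks fire no longer affects the layout, and a frame with an undetected hand can still be used instead of being held until all four lists are non-null.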
kuaashish assigned kuaashish and unassigned kalyan2789g on Jan 10, 2025
kuaashish added the platform:android, task:image classification, task:gesture recognition, and type:support labels on Jan 10, 2025

3 participants