Deep learning (DL) techniques have achieved exceptional success in building high-performing models for medical imaging applications. Their effectiveness, however, depends largely on access to large, high-quality labeled datasets, which are difficult to obtain in the medical domain because of the high cost of annotation and privacy constraints. This dissertation introduces several novel deep learning approaches for handling imperfect medical datasets, with the goals of reducing annotation effort and enhancing the generalization capabilities of DL models. Specifically, two imperfect-data challenges are studied.

(1) Scarce annotation, where only a limited amount of labeled data is available for training. We propose several novel self-supervised learning techniques that exploit the inherent structure of medical images to improve representation learning. In addition, data augmentation with image-synthesis models is explored, using synthesized images to further improve self-supervised learning performance.

(2) Weak annotation, in which the training data carries only image-level, noisy, sparse, or inconsistent annotations. We first introduce a novel self-supervised learning-based approach that better exploits image-level labels for medical image semantic segmentation. Motivated by the large inter-observer variation in myocardial annotations of ultrasound images, we further propose an extended Dice metric that integrates multiple annotations into the loss function, allowing the model to focus on learning generalizable features while minimizing the variation introduced by individual annotators.
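To illustrate the idea of integrating multiple annotations into a Dice-based loss, the following is a minimal NumPy sketch. It assumes one plausible formulation, averaging per-annotator soft Dice terms so that no single annotator's mask dominates the gradient; the dissertation's exact extended Dice formulation may differ, and the function names here are hypothetical.

```python
import numpy as np

def soft_dice(pred, mask, eps=1e-6):
    # Soft Dice coefficient between a predicted probability map and
    # one annotator's binary mask (both as float arrays of equal shape).
    inter = np.sum(pred * mask)
    return (2.0 * inter + eps) / (np.sum(pred) + np.sum(mask) + eps)

def multi_annotator_dice_loss(pred, masks, eps=1e-6):
    # Illustrative "extended Dice" loss: average the soft Dice scores
    # over all annotators' masks and convert to a loss in [0, 1].
    # This is an assumed formulation for illustration only.
    scores = [soft_dice(pred, m, eps) for m in masks]
    return 1.0 - float(np.mean(scores))
```

Under this formulation, a prediction that matches the annotator consensus incurs a low loss, while pixels where annotators disagree contribute a bounded, averaged penalty rather than a hard conflict.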