This paper explores the integration of device-to-device communications into clustered federated learning (FL), where clients are grouped into multiple clusters based on the similarity of their learning tasks. To reduce communication costs, we propose an efficient FL algorithm. Specifically, we designate a primary client within each cluster that is responsible for uploading the cluster model to the server, while the other clients in the cluster serve as secondary clients. Each secondary client assesses the similarity of its model to the primary client's model by computing a layer-wise model distance. If a secondary client's model distance exceeds a predefined threshold, indicating divergence from the primary client's model, it transmits its model distance to the edge server. The primary client then updates the cluster model parameters and broadcasts them to the secondary clients within the cluster. Closed-form expressions for the time consumed by the proposed layer-wise efficient FL are derived. Numerical results validate the training accuracy of the layer-wise efficient FL and demonstrate a notable reduction in communication costs compared with naive FL.
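To make the layer-wise reporting rule concrete, the following is a minimal sketch of how a secondary client might compare its model to the primary client's model and decide which layers to report. It assumes PyTorch state_dict tensors, a per-layer L2 norm as the distance metric, and a scalar threshold; the metric, the threshold value, and the helper names (layer_wise_distance, layers_to_report) are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def layer_wise_distance(secondary_state, primary_state):
    """Per-layer distance between a secondary client's model and the
    primary client's model. The L2 norm per layer is one possible
    choice; the abstract does not fix the metric."""
    return {
        name: torch.norm(secondary_state[name].float() - primary_state[name].float()).item()
        for name in primary_state
    }

def layers_to_report(secondary_state, primary_state, threshold):
    """Return the layers whose distance exceeds the (assumed scalar)
    threshold; only these distances would be sent to the edge server."""
    distances = layer_wise_distance(secondary_state, primary_state)
    return {name: d for name, d in distances.items() if d > threshold}
```

In this sketch, a secondary client would call layers_to_report after each local training round and stay silent when the returned dictionary is empty, which is the mechanism by which communication with the edge server is saved relative to always uploading a full model.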