Replacing QueryDSL-JPA with MongoDB Aggregation #1

채마스 2023. 2. 10. 23:44

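Overview

As interest in the framework we built has grown inside the company, requests from several teams to adopt it have been increasing.
Some teams that build small solutions use only MongoDB, with no RDBMS, and they asked for a technical review of whether the framework could be applied to them as well. (For a small solution, setting up an RDB is sometimes too much.)
While looking into how logic written against an RDB could be moved to MongoDB, I came across MongoDB Aggregation and learned that it can be implemented with Spring Data MongoDB.
In the end the migration succeeded, and this post is a write-up of that process.

MongoDB is a document-based NoSQL database, so can it replace RDBMS queries? With the Aggregation feature that MongoDB provides, most RDBMS queries can be replaced. (It is less efficient in terms of performance, though.)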

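What is MongoDB Aggregation?

MongoDB Aggregation is a framework for aggregating data efficiently, including across sharded collections.
It is mainly used when a search is hard to express with db.collection.find(query) alone. (In RDB terms, when a join, group by, and so on are needed.)
Data is retrieved in the desired shape through a pipeline.
A pipeline contains multiple Aggregation Operations.
By combining several Aggregation Operations in the right order, data can be queried in MongoDB in whatever shape is needed, much as in an RDBMS.
- Of course, designing MongoDB like an RDB is not a good idea, but Aggregation is useful when a complex structure has to be queried anyway.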
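
To make the "stages applied in order" idea concrete, here is a minimal plain-Java sketch that models a pipeline as a list of functions over in-memory documents. It is only an illustration of the concept, not how MongoDB executes pipelines; the class name PipelineSketch and the helpers match, sortAsc, limit, and run are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical in-memory model: each stage takes a list of documents and returns a new list.
final class PipelineSketch {

    // Rough analogue of $match: keep documents whose field equals the given value.
    static UnaryOperator<List<Map<String, Object>>> match(String field, Object value) {
        return docs -> docs.stream()
                .filter(d -> value.equals(d.get(field)))
                .collect(Collectors.toList());
    }

    // Rough analogue of $sort (ascending) on an integer field.
    static UnaryOperator<List<Map<String, Object>>> sortAsc(String field) {
        return docs -> docs.stream()
                .sorted(Comparator.comparingInt((Map<String, Object> d) -> (Integer) d.get(field)))
                .collect(Collectors.toList());
    }

    // Rough analogue of $limit: keep only the first n documents.
    static UnaryOperator<List<Map<String, Object>>> limit(int n) {
        return docs -> docs.stream().limit(n).collect(Collectors.toList());
    }

    // Runs the stages in order, feeding each stage's output into the next stage.
    static List<Map<String, Object>> run(List<Map<String, Object>> input,
                                         List<UnaryOperator<List<Map<String, Object>>>> stages) {
        List<Map<String, Object>> current = new ArrayList<>(input);
        for (UnaryOperator<List<Map<String, Object>>> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }
}
```

Calling run with match, then sortAsc, then limit(1) returns the single lowest-numbered matching document; reordering the stages (for example, limit before match) can change the result, which is why stage order matters.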

Aggregation Operation

MongoDB provides many kinds of Aggregation Operations.
An Aggregation Operation is also called an Aggregation pipeline stage.
db.collection.aggregate(pipeline, options)
- pipeline contains multiple stages (Aggregation Operations).
The following Aggregation pipeline stages exist.

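Stage (Operation)	Description

$lookup	Joins the input documents with another collection in the same database and passes the result to the next stage. (Similar to a join in an RDB.)
$unwind	If an input document has an array field, expands it into one document per element and passes those to the next stage. (Unwinding makes the elements easier to reference in later stages.)
$project	Selects only the needed fields from the input documents, or renames them, and passes the result to the next stage. (Similar to a projection in an RDB.)
$match	Passes on only those documents from the previous stage that satisfy the condition. (Similar to a where clause in an RDB.)
$sort	Sorts the input documents and passes them to the next stage. (Similar to order by in an RDB.)
$group	Groups the input documents by the given key and computes counts, sums, averages, and so on. (Similar to group by in an RDB.)
$limit	Passes only the first N input documents to the next stage. (Similar to limit in an RDB.)
$skip	Discards the first N input documents and passes the remaining documents to the next stage. (Similar to offset in an RDB.)
$count	Counts the input documents and passes the count to the next stage.
$addFields	Adds new fields to the input documents.
$replaceRoot	Promotes a specified field of the input document to the top-level document and discards the rest.
$redact	Used to rebuild a document from only the fields or subdocuments that match a specified condition, when the fields or subdocuments vary in form.
$out	Writes the result of the pipeline to a collection. $out must be the last stage of the pipeline.
$sample	Randomly samples a given number of input documents and passes them to the next stage.
$geoNear	Performs a location-based search relative to a given point and passes only the results within a given radius to the next stage.
$collStats	Retrieves statistics about a collection and passes them to the next stage.
$indexStats	Retrieves statistics about index usage and passes them to the next stage.
$merge	Writes the result of the Aggregation Pipeline to a collection as documents. $merge must always be the last stage.
$unionWith	Combines two collections, producing the pipeline results of both as a single result set.
$facet	Performs grouping along several dimensions within a single stage. It is often used with sub-stages such as $bucket, $bucketAuto, and $sortByCount.
$bucket	Groups the input documents into ranges. $group creates a group for every unique value, whereas $bucket lets the user define the boundaries of each group.
$bucketAuto	Same as $bucket, except that the MongoDB server, not the user, determines the group boundaries automatically.
$sortByCount	Groups documents by a field and passes the groups to the next stage sorted by count in descending order.
$graphLookup	Runs a graph (recursive) query against another collection in the same database as the input documents.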

Of these, I used $lookup, $match, $sort, $project, and $unwind.
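
Since $skip and $limit together play the role of offset and limit in an RDB, the following plain-Java sketch shows how a page request might translate into the two values. It is an in-memory illustration only; PagingSketch, skipFor, and page are hypothetical names, and the page index is assumed to be zero-based.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating $skip followed by $limit for paging.
final class PagingSketch {

    // Number of documents a $skip stage would discard for a zero-based page index.
    static long skipFor(int pageIndex, int pageSize) {
        return (long) pageIndex * pageSize;
    }

    // Applies skip first and then limit, the same order the two stages would have in a pipeline.
    static <T> List<T> page(List<T> sortedDocs, int pageIndex, int pageSize) {
        return sortedDocs.stream()
                .skip(skipFor(pageIndex, pageSize))
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

Paging is only stable when the input is sorted first, so in a real pipeline a $sort stage would normally precede $skip and $limit.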

Aggregation Options

db.collection.aggregate(pipeline, options)
- options holds the options used when running the aggregation.
The following Aggregation options exist.

Option	Description

allowDiskUse	By default, an aggregate() stage such as a sort can use up to 100MB of memory. When allowDiskUse is set to true, the stage can use disk instead: temporary working data is written to a _tmp directory under the data directory.
explain	Shows the execution plan of the aggregate() command.
cursor	Sets the batch size of the cursor returned by the aggregate() command.
maxTimeMS	Sets the maximum execution time of the aggregate() command, in milliseconds.
readConcern	Sets the readConcern that the aggregate() command uses when reading documents. (default: local)
bypassDocumentValidation	When the result of the aggregate() command is written to another collection, sets whether that collection's document validation is skipped.
collation	Sets the collation used by the aggregate() command, so that results can be compared and sorted as intended.

Of these, I tried allowDiskUse.

MongoDB Aggregation Examples

MongoDB Aggregation is also supported by Spring Data MongoDB.
Like JPA, Spring Data MongoDB offers QueryDsl support, but it reportedly cannot express complex queries or use Aggregation, so it is said to be rarely used in practice.
So I did not use QueryDsl either and implemented everything with the Aggregation support in Spring Data MongoDB.
Let's look at some code examples.

$lookup

$lookup joins the input documents with another collection in the same database and passes the result to the next stage.
It is similar to a join in an RDB.

db.tenant.aggregate([        
  {            
    $lookup: {                
      from: "company",                
      let: {tenant_id: "$tenant_id"},                
      pipeline:[{                    
        $match: {                        
          $expr: {                            
            $and: [                                
              {$eq: ["$tenant_id", "$$tenant_id"]}                            
            ]                        
          }                    
        }                
      }],                
      as: "company"            
    },        
  },    
])

Implemented with Spring Data MongoDB, it looks like this.

AggregationOperation lookupOperation = (context) -> new Document(
        "$lookup",
        new Document("from", "company")
                .append("let", new Document("tenant_id", "$tenant_id"))
                .append("pipeline", List.of(
                        new Document("$match",
                                new Document("$expr",
                                        new Document("$and", List.of(
                                                new Document("$eq", List.of("$tenant_id", "$$tenant_id"))
                                        ))
                                )
                        )
                ))
                .append("as", "company")
);

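Comparing the two, each '{}' becomes a Document, each ',' becomes an append() call, and each array becomes a List.of().
With a little study, any aggregation query can be implemented in Java code this way.
There is a more convenient Aggregation.lookup() method, but I implemented the stage by hand as above in order to support more complex join shapes.
For the remaining operations I used the static methods provided by Spring Data MongoDB.
In fact $lookup is the most complex one; the other operations are comparatively simple.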

$unwind

db.tenant.aggregate([        
  {            
    $unwind: {                
      path: "$company",
      preserveNullAndEmptyArrays: true,            
    }        
  },   
])

AggregationOperation unwindOperation1 = (context) -> new Document(
        "$unwind",
        new Document("path", "$company")
                .append("preserveNullAndEmptyArrays", true)
);

As mentioned above, this can be implemented more simply with the Aggregation.unwind() method provided by Spring Data MongoDB.

UnwindOperation unwindOperation = Aggregation.unwind("company", true);
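
To show what preserveNullAndEmptyArrays changes, here is a simplified plain-Java model of $unwind applied to a single in-memory document. It is a sketch of the documented behavior under simplifying assumptions (top-level field only, no includeArrayIndex), not MongoDB's implementation; UnwindSketch and unwind are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of {$unwind: {path: "$<field>", preserveNullAndEmptyArrays: <preserve>}}.
final class UnwindSketch {

    static List<Map<String, Object>> unwind(Map<String, Object> doc, String field, boolean preserve) {
        Object value = doc.get(field);
        List<Map<String, Object>> out = new ArrayList<>();
        if (value instanceof List) {
            List<?> elements = (List<?>) value;
            // One output document per array element, with the array replaced by that element.
            for (Object element : elements) {
                Map<String, Object> copy = new LinkedHashMap<>(doc);
                copy.put(field, element);
                out.add(copy);
            }
            // Empty array: the document survives only when preserve is true, without the field.
            if (elements.isEmpty() && preserve) {
                Map<String, Object> copy = new LinkedHashMap<>(doc);
                copy.remove(field);
                out.add(copy);
            }
        } else if (value == null) {
            // Missing or null field: the document survives only when preserve is true.
            if (preserve) {
                out.add(new LinkedHashMap<>(doc));
            }
        } else {
            // A non-array value is treated like a single-element array.
            out.add(new LinkedHashMap<>(doc));
        }
        return out;
    }
}
```

A tenant whose company array has two entries yields two documents; a tenant with an empty company array yields one document when preserve is true and none when it is false.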

$match

db.tenant.aggregate([        
  {            
    $match: {                
      $expr: {                    
        $and: [                        
          {$regexMatch: {input: "$tenant_code", regex: "T", options: "i"}},                        
          {$eq: ["$company.company_code", "COM_1"]}
        ]                
      }            
    }        
  },     
])

AggregationOperation matchOperation = (context) -> new Document(
        "$match",
        new Document("$expr",
                new Document(
                        "$and", List.of(
                        new Document("$regexMatch", new Document("input", "$tenant_code").append("regex", "T").append("options", "i")),
                        new Document("$eq", Arrays.asList("$company.company_code", "COM_1"))
                )
                )
        )
);

To use the Aggregation.match() method instead, pass an array of Criteria as the parameter, as follows.

List<Criteria> criteriaList = new ArrayList<>();
criteriaList.add(Criteria.where("tenant_code").regex("T", "i"));
criteriaList.add(Criteria.where("company.company_code").is("COM_1"));
MatchOperation matchOperation = Aggregation.match(new Criteria()
        .andOperator(
                criteriaList.toArray(new Criteria[0])
        )
);
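
Collecting conditions in a list, as above, is what makes a dynamic query possible: a condition is added only when its search parameter is present. The plain-Java sketch below mimics that idea with predicates over an in-memory document so that it runs without MongoDB. DynamicMatchSketch and build are hypothetical names, the document uses flat keys (company_code rather than company.company_code) for brevity, and the search text is treated as a literal rather than a regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical in-memory analogue of building a $match from optional search parameters.
final class DynamicMatchSketch {

    static Predicate<Map<String, Object>> build(String tenantCode, String companyCode) {
        List<Predicate<Map<String, Object>>> conditions = new ArrayList<>();
        if (tenantCode != null && !tenantCode.trim().isEmpty()) {
            // Like a regex match with the "i" option: case-insensitive, matches anywhere in the value.
            Pattern pattern = Pattern.compile(Pattern.quote(tenantCode), Pattern.CASE_INSENSITIVE);
            conditions.add(doc -> {
                Object value = doc.get("tenant_code");
                return value != null && pattern.matcher(value.toString()).find();
            });
        }
        if (companyCode != null && !companyCode.trim().isEmpty()) {
            conditions.add(doc -> companyCode.equals(doc.get("company_code")));
        }
        // AND all collected conditions; with no conditions every document matches.
        Predicate<Map<String, Object>> combined = doc -> true;
        for (Predicate<Map<String, Object>> condition : conditions) {
            combined = combined.and(condition);
        }
        return combined;
    }
}
```

With Spring Data MongoDB, each branch would add a Criteria to criteriaList instead of a predicate. One point to check in that version is the empty case: when no parameter is supplied, it may be simplest to skip the $match stage altogether.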

$sort

db.tenant.aggregate([        
  {            
    $sort: {"tenant_id": -1, "company.company_name": 1}
  },            
])

The sort operation can also be implemented easily with the Aggregation.sort() method, as follows.

SortOperation sortOperation = Aggregation.sort(Sort.Direction.DESC, "tenant_id")
                .and(Sort.Direction.ASC, "company.company_name");
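
For readers more used to Java collections, the same two-key ordering (tenant_id descending, then company name ascending as the tie-breaker) can be pictured as a Comparator. This is only an in-memory analogy; SortSketch is a hypothetical name and the nested company.company_name field is flattened to company_name for brevity.

```java
import java.util.Comparator;
import java.util.Map;

// Hypothetical in-memory analogue of {$sort: {"tenant_id": -1, "company.company_name": 1}}.
final class SortSketch {

    // The first key decides the order; the second key is consulted only when the first key ties.
    static final Comparator<Map<String, Object>> TENANT_DESC_COMPANY_ASC =
            Comparator.comparing(
                            (Map<String, Object> d) -> (Integer) d.get("tenant_id"),
                            Comparator.<Integer>reverseOrder())
                    .thenComparing(d -> (String) d.get("company_name"));
}
```

As in the $sort document, the order in which the keys are listed is the order of precedence.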

$project

db.tenant.aggregate([        
  {            
    $project: {                
      "tenant_id": 1,                
      "tenant_name": 1,                
      "tenant_code": 1,                
      "language": 1,                
      "company.company_name": 1,                
      "company.company_id": 1,                
      "customer.customer_name": 1,                
      "customer.login_id": 1            
    }        
  }        
])

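The project operation can also be implemented with the Aggregation.project() method, as follows. Note that this version additionally renames each field to the camelCase name used by the DTO.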

ProjectionOperation projectionOperation = Aggregation.project()
        .and("tenant_id").as("tenantId")
        .and("tenant_name").as("tenantName")
        .and("tenant_code").as("tenantCode")
        .and("language").as("language")
        .and("company.company_name").as("companyName")
        .and("company.company_id").as("companyId")
        .and("customer.customer_name").as("customerName")
        .and("customer.login_id").as("loginId");

Now that we have covered MongoDB Aggregation, let's look at how to replace QueryDSL-JPA logic with it.

QueryDsl-JPA Example

Let's compare an ANSI query with MongoDB Aggregation.
First, assume there are three tables: tenant, company, and customer.
company has a tenantId, and customer has a companyId.
Now assume the following query.

select
  tenant.tenant_id as tenantId,
  tenant.tenant_name as tenantName,
  tenant.tenant_code as tenantCode,
  tenant.language as language,
  company.company_name as companyName,
  company.company_id as companyId,
  customer.customer_name as customerName,
  customer.login_id as loginId
from tenant
left join company on tenant.tenant_id = company.tenant_id
left join customer on company.company_id = customer.company_id
where lower(tenant.tenant_code) like lower('%T%')
  and company.company_code = 'COM_1'
order by tenant.tenant_id desc, company.company_name asc;

Implemented with QueryDsl, the query above looks like this.

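import com.querydsl.core.types.Projections;
import com.querydsl.core.types.dsl.BooleanExpression;
import com.querydsl.jpa.impl.JPAQueryFactory;
import lombok.RequiredArgsConstructor;
import org.springframework.util.StringUtils;

import java.util.List;

// Assumes static imports of the generated Q-types (QTenant.tenant, QCompany.company, QCustomer.customer).
@RequiredArgsConstructor
public class TenantRepositoryImpl implements TenantRepositoryCustom {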

    private final JPAQueryFactory queryFactory;

    @Override
    public List<TenantDto> searchTenantList(TenantDto tenantDto) {
        return queryFactory
                .select(Projections.fields(TenantDto.class,
                        tenant.tenantId.as("tenantId"),
                        tenant.tenantName.as("tenantName"),
                        tenant.tenantCode.as("tenantCode"),
                        tenant.language.as("language"),
                        company.companyName.as("companyName"),
                        company.companyId.as("companyId"),
                        customer.customerName.as("customerName"),
                        customer.loginId.as("loginId")
                        )
                )
                .from(tenant)
                .leftJoin(company).on(tenant.tenantId.eq(company.tenantId))
                .leftJoin(customer).on(company.companyId.eq(customer.companyId))
                .where(
                        tenantCodeContains(tenantDto.getTenantCode()),
                        companyCodeEq(tenantDto.getCompanyCode())
                )
                .orderBy(
                        tenant.tenantId.desc(),
                        company.companyName.asc()
                )
                .fetch();
    }

    private BooleanExpression tenantCodeContains(String tenantCode) {
        return StringUtils.hasText(tenantCode) ? tenant.tenantCode.containsIgnoreCase(tenantCode) : null;
    }

    private BooleanExpression companyCodeEq(String companyCode) {
        return StringUtils.hasText(companyCode) ? company.companyCode.eq(companyCode) : null;
    }
}

Now let's convert the query above using Spring Data MongoDB.

Spring Data MongoDB

Let's convert the ANSI query shown earlier using the following rules.

left join -> $lookup + $unwind
where clause (dynamic query) -> $match
projection -> $project
order by -> $sort

Conversion Result

db.tenant.aggregate([        
  {            
    $lookup: {                
      from: "company",                
      let: {tenant_id: "$tenant_id"},                
      pipeline:[{                    
        $match: {                        
          $expr: {                            
            $and: [                                
              {$eq: ["$tenant_id", "$$tenant_id"]}                            
              ]                        
          }                    
        }                
      }],                
      as: "company"            
    },        
  },        
  {            
    $unwind: {                
      path: "$company",                
      preserveNullAndEmptyArrays: true,            
    }        
  },        
  {            
    $lookup: {                
      from: "customer",                
      let: {company_id: "$company.company_id"},
      pipeline:[{
        $match: {
          $expr: {
            $and: [
              {$eq: ["$company_id", "$$company_id"]}
            ]                        
          }                    
        }                
      }],                
      as: "customer"            
    },        
  },        
  {            
    $unwind: {                
      path: "$customer",                
      preserveNullAndEmptyArrays: true,            
    }        
  },        
  {            
    $match: {                
      $expr: {                    
        $and: [                        
          {$regexMatch: {input: "$tenant_code", regex: "T", options: "i"}},                        
          {$eq: ["$company.company_code", "COM_1"]}
        ]                
      }            
    }        
  },        
  {            
    $sort: {"tenant_id": -1, "company.company_name": 1}
  },        
  {            
    $project: {                
      "tenant_id": 1,                
      "tenant_name": 1,                
      "tenant_code": 1,                
      "language": 1,                
      "company.company_name": 1,                
      "company.company_id": 1,                
      "customer.customer_name": 1,                
      "customer.login_id": 1            
    }        
  }    
])

Implemented with Spring Data MongoDB, the MongoDB Aggregation above looks like this.

final List<AggregationOperation> operationList = new ArrayList<>();

AggregationOperation lookupOperation1 = (context) -> new Document(
        "$lookup",
        new Document("from", "company")
                .append("let", new Document("tenant_id", "$tenant_id"))
                .append("pipeline", List.of(
                        new Document("$match",
                                new Document("$expr",
                                        new Document("$and", List.of(
                                                new Document("$eq", List.of("$tenant_id", "$$tenant_id"))
                                        ))
                                )
                        )
                ))
                .append("as", "company")
);

AggregationOperation unwindOperation1 = (context) -> new Document(
        "$unwind",
        new Document("path", "$company")
                .append("preserveNullAndEmptyArrays", true)
);

AggregationOperation lookupOperation2 = (context) -> new Document(
        "$lookup",
        new Document("from", "customer")
                .append("let", new Document("company_id", "$company.company_id"))
                .append("pipeline", List.of(
                        new Document("$match",
                                new Document("$expr",
                                        new Document("$and", List.of(
                                                new Document("$eq", List.of("$company_id", "$$company_id"))
                                        ))
                                )
                        )
                ))
                .append("as", "customer")
);

AggregationOperation unwindOperation2 = (context) -> new Document(
        "$unwind",
        new Document("path", "$customer")
                .append("preserveNullAndEmptyArrays", true)
);

AggregationOperation matchOperation = (context) -> new Document(
        "$match",
        new Document("$expr",
                new Document(
                        "$and", List.of(
                        new Document("$regexMatch", new Document("input", "$tenant_code").append("regex", "T").append("options", "i")),
                        new Document("$eq", Arrays.asList("$company.company_code", "COM_1"))
                )
                )
        )
);

SortOperation sortOperation = Aggregation.sort(Sort.Direction.DESC, "tenant_id")
                .and(Sort.Direction.ASC, "company.company_name");

ProjectionOperation projectionOperation = Aggregation.project()
        .and("tenant_id").as("tenantId")
        .and("tenant_name").as("tenantName")
        .and("tenant_code").as("tenantCode")
        .and("language").as("language")
        .and("company.company_name").as("companyName")
        .and("company.company_id").as("companyId")
        .and("customer.customer_name").as("customerName")
        .and("customer.login_id").as("loginId");

operationList.add(lookupOperation1);
operationList.add(unwindOperation1);
operationList.add(lookupOperation2);
operationList.add(unwindOperation2);
operationList.add(matchOperation);
operationList.add(sortOperation);
operationList.add(projectionOperation);

// Run the aggregation
Aggregation aggregation = Aggregation.newAggregation(
        operationList.toArray(new AggregationOperation[0])
);

List<TenantDto> results = mongoTemplate.aggregate(aggregation, "tenant", TenantDto.class)
        .getMappedResults();

Above, each $lookup operation is followed by an $unwind operation. Combined this way, they behave much like a left join in an RDB.
- If preserveNullAndEmptyArrays of $unwind is set to false, the combination behaves much like an inner join.
- In practice, when converting QueryDsl code, I translated each left join into $lookup + $unwind.

So far we have seen how to convert QueryDSL-JPA logic using MongoDB Aggregation. As you can see, however, the code is long and complex. In the next part, we will implement a Builder class that makes MongoDB Aggregation logic as convenient to use as QueryDSL.
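
The left join versus inner join behavior can be checked with a small in-memory model of $lookup followed by $unwind. This is a plain-Java sketch of the semantics only, assuming documents are simple maps joined on tenant_id; JoinSketch and lookupAndUnwind are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory model of $lookup (join on tenant_id) followed by $unwind on the joined array.
final class JoinSketch {

    static List<Map<String, Object>> lookupAndUnwind(List<Map<String, Object>> tenants,
                                                     List<Map<String, Object>> companies,
                                                     boolean preserveNullAndEmptyArrays) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map<String, Object> tenant : tenants) {
            // $lookup analogue: collect every company whose tenant_id matches this tenant.
            List<Map<String, Object>> matched = new ArrayList<>();
            for (Map<String, Object> company : companies) {
                if (Objects.equals(tenant.get("tenant_id"), company.get("tenant_id"))) {
                    matched.add(company);
                }
            }
            // $unwind analogue: one output row per matched company.
            for (Map<String, Object> company : matched) {
                Map<String, Object> row = new LinkedHashMap<>(tenant);
                row.put("company", company);
                rows.add(row);
            }
            // A tenant with no company is kept only when preserveNullAndEmptyArrays is true (left join).
            if (matched.isEmpty() && preserveNullAndEmptyArrays) {
                rows.add(new LinkedHashMap<>(tenant));
            }
        }
        return rows;
    }
}
```

With two tenants where only the first has two companies, passing true yields three rows (the second tenant appears once, without a company), while passing false yields two rows, which mirrors left join and inner join respectively.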

References

https://rastalion.me/mongodb-aggregation/
