Skip to contents

Intro

This tutorial show how to append additional information to a longitudinal or service-sector table in bigquery using the bigrquery package.

Information to add includes, name of the medical codes in English, name for the type of visit, name for the type of provider.

This tutorial also included how to append additional information to the Kanta lab data. In particular, the name of the OMOP concept for the lab test.

Set up

This tutorial has been developed using dummy data under the Atlas development project. To execute it in Sandbox, be sure to change the following paths.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(bigrquery)
library(FinnGenUtilsR)


bigrquery::bq_auth(path = Sys.getenv("GCP_SERVICE_KEY"))
project_id <- "atlas-development-270609"

test_longitudinal_data_table <- "atlas-development-270609.sandbox_tools_r12.finngen_r12_service_sector_detailed_longitudinal_v1"
fg_codes_info_table <- "atlas-development-270609.medical_codes.fg_codes_info_v7"

tmp_schema <- "sandbox"

kanta_lab_data_table <- "atlas-development-270609.sandbox_tools_r12.kanta_r12_v1"
omop_schema <- "atlas-development-270609.etl_sam_r12_5k_omop"

SERVICE SECTOR

Generate a table in service sector format

For example we create a subset of the main service sector table by querying latest 20 events from subject “FG00000001”:

sql <- paste("SELECT * FROM ", test_longitudinal_data_table, "WHERE FINNGENID='FG00000001' ORDER BY APPROX_EVENT_DAY DESC LIMIT 20")
event_for_subject_1 <- bq_project_query(project_id, sql)

We can see that this table does not currently have information about the medical codes:

bq_table_download(event_for_subject_1)
#> # A tibble: 20 × 16
#>    FINNGENID  SOURCE   EVENT_AGE APPROX_EVENT_DAY CODE1  CODE2 CODE3 CODE4 CODE5
#>    <chr>      <chr>        <dbl> <date>           <chr>  <chr> <chr> <chr> <chr>
#>  1 FG00000001 PRIM_OUT      47.1 2023-06-20       WYA05  NA    NA    NA    R10  
#>  2 FG00000001 OUTPAT        47.0 2023-05-22       N3282  NA    NA    NA    NA   
#>  3 FG00000001 OUTPAT        47.0 2023-05-22       C5041  NA    NA    NA    NA   
#>  4 FG00000001 OUTPAT        47.0 2023-05-22       F03    NA    NA    NA    NA   
#>  5 FG00000001 OUTPAT        47.0 2023-05-22       C5691  NA    NA    NA    NA   
#>  6 FG00000001 OUTPAT        47.0 2023-05-22       G564   G564  NA    NA    NA   
#>  7 FG00000001 OUTPAT        47.0 2023-05-22       Z992   NA    NA    NA    NA   
#>  8 FG00000001 OUTPAT        47.0 2023-05-22       M069   NA    NA    NA    NA   
#>  9 FG00000001 OUTPAT        47.0 2023-05-22       W06    NA    NA    NA    NA   
#> 10 FG00000001 OUTPAT        47.0 2023-05-22       F4322  NA    NA    NA    NA   
#> 11 FG00000001 PRIM_OUT      46.8 2023-03-24       WX290  NA    NA    NA    R10  
#> 12 FG00000001 PRIM_OUT      46.8 2023-03-21       NH3BA  NA    NA    NA    R10  
#> 13 FG00000001 PRIM_OUT      46.8 2023-03-21       SPAT1… NA    NA    NA    R10  
#> 14 FG00000001 PRIM_OUT      46.8 2023-03-21       Z012   NA    NA    NA    R10  
#> 15 FG00000001 PRIM_OUT      46.8 2023-03-03       L40    NA    NA    NA    R50  
#> 16 FG00000001 PURCH         46.6 2022-12-31       A12BA… 205   3983… 1     NA   
#> 17 FG00000001 PURCH         46.6 2022-12-25       L04AD… 127   4675… 1     NA   
#> 18 FG00000001 PRIM_OUT      46.6 2022-12-25       SPAT1… NA    NA    NA    R20  
#> 19 FG00000001 PRIM_OUT      46.6 2022-12-25       SPAT1… NA    NA    NA    R20  
#> 20 FG00000001 PRIM_OUT      46.6 2022-12-25       M750   NA    NA    NA    R20  
#> # ℹ 7 more variables: CODE6 <chr>, CODE7 <chr>, CODE8 <chr>, CODE9 <chr>,
#> #   ICDVER <chr>, CATEGORY <chr>, INDEX <chr>
(same table in html format for exploration)

Adding info to medical codes

Function fg_bq_append_code_info_to_longitudinal_data adds new columns with information about the medical code used in each event/row.

event_for_subject_1_with_translations <- fg_bq_append_code_info_to_longitudinal_data(
  project_id, event_for_subject_1, fg_codes_info_table)

For example: code, for the original code; name_en, for the name of the code in English; name_fi for the name of the code in Finnish; and omop_concept_id, to find additional info in Atlas.

bq_table_download(event_for_subject_1_with_translations) |>  
  dplyr::select(FINNGENID, SOURCE, APPROX_EVENT_DAY, CODE1, CODE2, CODE3, code, name_en, name_fi, omop_concept_id) 
#> # A tibble: 20 × 10
#>    FINNGENID  SOURCE   APPROX_EVENT_DAY CODE1  CODE2 CODE3 code  name_en name_fi
#>    <chr>      <chr>    <date>           <chr>  <chr> <chr> <chr> <chr>   <chr>  
#>  1 FG00000001 PRIM_OUT 2023-06-20       WYA05  NA    NA    WYA05 Emerge… Ensiap…
#>  2 FG00000001 OUTPAT   2023-05-22       N3282  NA    NA    N32.… Instab… Virtsa…
#>  3 FG00000001 OUTPAT   2023-05-22       C5041  NA    NA    C50.… Malign… Rinnan…
#>  4 FG00000001 OUTPAT   2023-05-22       F03    NA    NA    F03   Unspec… NA     
#>  5 FG00000001 OUTPAT   2023-05-22       C5691  NA    NA    C56.… Malign… Munasa…
#>  6 FG00000001 OUTPAT   2023-05-22       G564   G564  NA    G56.4 Causal… NA     
#>  7 FG00000001 OUTPAT   2023-05-22       Z992   NA    NA    Z99.2 Depend… NA     
#>  8 FG00000001 OUTPAT   2023-05-22       M069   NA    NA    M06.9 Rheuma… NA     
#>  9 FG00000001 OUTPAT   2023-05-22       W06    NA    NA    W06   Fall i… NA     
#> 10 FG00000001 OUTPAT   2023-05-22       F4322  NA    NA    F43.… Adapta… Sopeut…
#> 11 FG00000001 PRIM_OUT 2023-03-24       WX290  NA    NA    WX290 Other … Muu jo…
#> 12 FG00000001 PRIM_OUT 2023-03-21       NH3BA  NA    NA    NA    NA      NA     
#> 13 FG00000001 PRIM_OUT 2023-03-21       SPAT1… NA    NA    SPAT… Admini… Lääkke…
#> 14 FG00000001 PRIM_OUT 2023-03-21       Z012   NA    NA    Z01.2 Dental… NA     
#> 15 FG00000001 PRIM_OUT 2023-03-03       L40    NA    NA    L40   Psoria… NA     
#> 16 FG00000001 PURCH    2022-12-31       A12BA… 205   3983… A12B… potass… NA     
#> 17 FG00000001 PURCH    2022-12-25       L04AD… 127   4675… L04A… ciclos… NA     
#> 18 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 19 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 20 FG00000001 PRIM_OUT 2022-12-25       M750   NA    NA    M75.0 Adhesi… NA     
#> # ℹ 1 more variable: omop_concept_id <chr>
(same table in html format for exploration)

If an event/row uses more than one medical code

Some events include more than one medical code, or more than one way to understand a medical code. Parameters in fg_bq_append_code_info_to_longitudinal_data allows to chose how to name the medical code in this situations.

For example:

  • PURCH_map_to = “VNR”, tells to use the VNR code in PURCH register, instead of the default ATC
  • ICD10fi_map_to = “CODE1”, tells to use only the first code in ICD10fi, instead of the default CODE1*CODE2 combination code
event_for_subject_1_with_translations <- fg_bq_append_code_info_to_longitudinal_data(
  project_id, event_for_subject_1, fg_codes_info_table, 
    PURCH_map_to = "VNR", 
    ICD10fi_map_to = "CODE1"
)
one_patient_long_data <- bq_table_download(event_for_subject_1_with_translations)
bq_table_download(event_for_subject_1_with_translations) |>  
  dplyr::select(FINNGENID, SOURCE, APPROX_EVENT_DAY, CODE1, CODE2, CODE3, code, name_en, name_fi, omop_concept_id) 
#> # A tibble: 20 × 10
#>    FINNGENID  SOURCE   APPROX_EVENT_DAY CODE1  CODE2 CODE3 code  name_en name_fi
#>    <chr>      <chr>    <date>           <chr>  <chr> <chr> <chr> <chr>   <chr>  
#>  1 FG00000001 PRIM_OUT 2023-06-20       WYA05  NA    NA    WYA05 Emerge… Ensiap…
#>  2 FG00000001 OUTPAT   2023-05-22       N3282  NA    NA    N32.… Instab… Virtsa…
#>  3 FG00000001 OUTPAT   2023-05-22       C5041  NA    NA    C50.… Malign… Rinnan…
#>  4 FG00000001 OUTPAT   2023-05-22       F03    NA    NA    F03   Unspec… NA     
#>  5 FG00000001 OUTPAT   2023-05-22       C5691  NA    NA    C56.… Malign… Munasa…
#>  6 FG00000001 OUTPAT   2023-05-22       G564   G564  NA    G56.4 Causal… NA     
#>  7 FG00000001 OUTPAT   2023-05-22       Z992   NA    NA    Z99.2 Depend… NA     
#>  8 FG00000001 OUTPAT   2023-05-22       M069   NA    NA    M06.9 Rheuma… NA     
#>  9 FG00000001 OUTPAT   2023-05-22       W06    NA    NA    W06   Fall i… NA     
#> 10 FG00000001 OUTPAT   2023-05-22       F4322  NA    NA    F43.… Adapta… Sopeut…
#> 11 FG00000001 PRIM_OUT 2023-03-24       WX290  NA    NA    WX290 Other … Muu jo…
#> 12 FG00000001 PRIM_OUT 2023-03-21       NH3BA  NA    NA    NA    NA      NA     
#> 13 FG00000001 PRIM_OUT 2023-03-21       SPAT1… NA    NA    SPAT… Admini… Lääkke…
#> 14 FG00000001 PRIM_OUT 2023-03-21       Z012   NA    NA    Z01.2 Dental… NA     
#> 15 FG00000001 PRIM_OUT 2023-03-03       L40    NA    NA    L40   Psoria… NA     
#> 16 FG00000001 PURCH    2022-12-31       A12BA… 205   3983… 3983… DUREKA… DUREKA…
#> 17 FG00000001 PURCH    2022-12-25       L04AD… 127   4675… 4675… SANDIM… SANDIM…
#> 18 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 19 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 20 FG00000001 PRIM_OUT 2022-12-25       M750   NA    NA    M75.0 Adhesi… NA     
#> # ℹ 1 more variable: omop_concept_id <chr>
(same table in html format for exploration)

Se the help, for more options

?fg_append_code_info_to_longitudinal_data_sql

Reduce code precision

ICD10fi, ICD9fi, ICD8fi, ATC, and NCSPfi, code systems reflect the hierarchy of the code by the first letters on them.

It is possible to truncate these medical codes before the information is added. This in practice adds the code and names of the parent codes.

event_for_subject_1_with_translations <- fg_bq_append_code_info_to_longitudinal_data(
  project_id, event_for_subject_1, fg_codes_info_table, 
  ICD10fi_precision = 3,
  ICD9fi_precision = 3,
  ICD8fi_precision = 3,
  ATC_precision = 3,
  NCSPfi_precision = 2
)

We can see in column code the truncated code and in name_en the name of the truncated code.

bq_table_download(event_for_subject_1_with_translations) |>  
  dplyr::select(FINNGENID, SOURCE, APPROX_EVENT_DAY, CODE1, CODE2, CODE3, code, name_en, name_fi, omop_concept_id) 
#> # A tibble: 20 × 10
#>    FINNGENID  SOURCE   APPROX_EVENT_DAY CODE1  CODE2 CODE3 code  name_en name_fi
#>    <chr>      <chr>    <date>           <chr>  <chr> <chr> <chr> <chr>   <chr>  
#>  1 FG00000001 PRIM_OUT 2023-06-20       WYA05  NA    NA    WY    Käynti… Käynti…
#>  2 FG00000001 OUTPAT   2023-05-22       N3282  NA    NA    N32   Other … NA     
#>  3 FG00000001 OUTPAT   2023-05-22       C5041  NA    NA    C50   Malign… NA     
#>  4 FG00000001 OUTPAT   2023-05-22       F03    NA    NA    F03   Unspec… NA     
#>  5 FG00000001 OUTPAT   2023-05-22       C5691  NA    NA    C56   Malign… NA     
#>  6 FG00000001 OUTPAT   2023-05-22       G564   G564  NA    G56   Monone… NA     
#>  7 FG00000001 OUTPAT   2023-05-22       Z992   NA    NA    Z99   Depend… NA     
#>  8 FG00000001 OUTPAT   2023-05-22       M069   NA    NA    M06   Other … NA     
#>  9 FG00000001 OUTPAT   2023-05-22       W06    NA    NA    W06   Fall i… NA     
#> 10 FG00000001 OUTPAT   2023-05-22       F4322  NA    NA    F43   Reacti… NA     
#> 11 FG00000001 PRIM_OUT 2023-03-24       WX290  NA    NA    WX    Anaest… Aneste…
#> 12 FG00000001 PRIM_OUT 2023-03-21       NH3BA  NA    NA    NA    NA      NA     
#> 13 FG00000001 PRIM_OUT 2023-03-21       SPAT1… NA    NA    SPAT… Admini… Lääkke…
#> 14 FG00000001 PRIM_OUT 2023-03-21       Z012   NA    NA    Z01   Other … NA     
#> 15 FG00000001 PRIM_OUT 2023-03-03       L40    NA    NA    L40   Psoria… NA     
#> 16 FG00000001 PURCH    2022-12-31       A12BA… 205   3983… A12   MINERA… NA     
#> 17 FG00000001 PURCH    2022-12-25       L04AD… 127   4675… L04   IMMUNO… NA     
#> 18 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 19 FG00000001 PRIM_OUT 2022-12-25       SPAT1… NA    NA    SPAT… No pro… Ei toi…
#> 20 FG00000001 PRIM_OUT 2022-12-25       M750   NA    NA    M75   Should… NA     
#> # ℹ 1 more variable: omop_concept_id <chr>
(same table in html format for exploration)

Adding info to visit type

Function fg_bq_append_visit_type_info_to_service_sector_data adds new columns with information about the visit type.

In the service-sector data, information about the visit type is defined in CODE5 to CODE9 depending on the SOURCE and time period. fg_bq_append_visit_type_info_to_service_sector_data abstracts this nuances, and assign one visit type code per row. Notice that a visit may contain more than one event/row. Events/rows belonging to the same visit share the same combination of SOURCE+INDEX.

By default, fg_bq_append_visit_type_info_to_service_sector_data will also include two columns is_clinic_visit and is_follow_up_visit that will be TRUE if the visit is a clinic visit or a follow-up visit, respectively. However, this works only if fg_codes_info_table version is v7 or higher. If you need to use lower versions set the parameters add_is_clinic_visist and add_is_follow_up_visit to FALSE. See the help for more details.

event_for_subject_1_with_visit_type <- fg_bq_append_visit_type_info_to_service_sector_data(
  project_id, event_for_subject_1, fg_codes_info_table
  )
bigrquery::bq_table_download(event_for_subject_1_with_visit_type) |> 
  dplyr::select(FINNGENID, SOURCE, INDEX, APPROX_EVENT_DAY, CODE5, CODE6, CODE8, CODE9, visit_type_code, visit_type_name_en, is_clinic_visit, is_follow_up_visit)
#> # A tibble: 20 × 12
#>    FINNGENID  SOURCE   INDEX APPROX_EVENT_DAY CODE5 CODE6 CODE8 CODE9
#>    <chr>      <chr>    <chr> <date>           <chr> <chr> <chr> <chr>
#>  1 FG00000001 PRIM_OUT 326   2023-06-20       R10   T11   NA    NA   
#>  2 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  3 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  4 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  5 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  6 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  7 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  8 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#>  9 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#> 10 FG00000001 OUTPAT   97    2023-05-22       NA    80A   R52   E    
#> 11 FG00000001 PRIM_OUT 287   2023-03-24       R10   T11   NA    NA   
#> 12 FG00000001 PRIM_OUT 24    2023-03-21       R10   T60   NA    NA   
#> 13 FG00000001 PRIM_OUT 24    2023-03-21       R10   T60   NA    NA   
#> 14 FG00000001 PRIM_OUT 24    2023-03-21       R10   T60   NA    NA   
#> 15 FG00000001 PRIM_OUT 273   2023-03-03       R50   T11   NA    NA   
#> 16 FG00000001 PURCH    89    2022-12-31       NA    NA    NA    NA   
#> 17 FG00000001 PURCH    68    2022-12-25       NA    NA    NA    NA   
#> 18 FG00000001 PRIM_OUT 99    2022-12-25       R20   T41   NA    NA   
#> 19 FG00000001 PRIM_OUT 99    2022-12-25       R20   T41   NA    NA   
#> 20 FG00000001 PRIM_OUT 99    2022-12-25       R20   T41   NA    NA   
#> # ℹ 4 more variables: visit_type_code <chr>, visit_type_name_en <chr>,
#> #   is_clinic_visit <lgl>, is_follow_up_visit <lgl>
(same table in html format for exploration)

Adding info to provider type

Function fg_bq_append_provider_info_to_service_sector_data adds new columns with information about the personal or unit that provided the diagnose during the visit.

In the service-sector data, information about the provider is defined in CODE6 to CODE7 depending on the SOURCE. fg_bq_append_visit_type_info_to_service_sector_data abstracts this nuances, and assign one provider type code per row.

event_for_subject_1_with_provider <- fg_bq_append_provider_info_to_service_sector_data(
  project_id, event_for_subject_1, fg_codes_info_table
  )
bigrquery::bq_table_download(event_for_subject_1_with_provider) |> 
  dplyr::select(FINNGENID, SOURCE, INDEX, APPROX_EVENT_DAY, CODE5, CODE6, CODE8, CODE9, provider_code, provider_name_en, provider_concept_class_id)
#> # A tibble: 20 × 11
#>    FINNGENID SOURCE INDEX APPROX_EVENT_DAY CODE5 CODE6 CODE8 CODE9 provider_code
#>    <chr>     <chr>  <chr> <date>           <chr> <chr> <chr> <chr> <chr>        
#>  1 FG000000… PRIM_… 326   2023-06-20       R10   T11   NA    NA    NA           
#>  2 FG000000… PURCH  68    2022-12-25       NA    NA    NA    NA    NA           
#>  3 FG000000… PURCH  89    2022-12-31       NA    NA    NA    NA    NA           
#>  4 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#>  5 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#>  6 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#>  7 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#>  8 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#>  9 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#> 10 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#> 11 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#> 12 FG000000… OUTPAT 97    2023-05-22       NA    80A   R52   E     80A          
#> 13 FG000000… PRIM_… 99    2022-12-25       R20   T41   NA    NA    51321        
#> 14 FG000000… PRIM_… 99    2022-12-25       R20   T41   NA    NA    51321        
#> 15 FG000000… PRIM_… 99    2022-12-25       R20   T41   NA    NA    51321        
#> 16 FG000000… PRIM_… 24    2023-03-21       R10   T60   NA    NA    2222         
#> 17 FG000000… PRIM_… 24    2023-03-21       R10   T60   NA    NA    2222         
#> 18 FG000000… PRIM_… 24    2023-03-21       R10   T60   NA    NA    2222         
#> 19 FG000000… PRIM_… 287   2023-03-24       R10   T11   NA    NA    32311        
#> 20 FG000000… PRIM_… 273   2023-03-03       R50   T11   NA    NA    32311        
#> # ℹ 2 more variables: provider_name_en <chr>, provider_concept_class_id <chr>
(same table in html format for exploration)

KANTA LAB DATA

Generate a table in service sector format

For example we create a subset of the main service sector table by querying latest 20 events from subject “FG00000001”:

sql <- paste("SELECT * FROM ", kanta_lab_data_table, "WHERE FINNGENID='FG00000001' ORDER BY APPROX_EVENT_DATETIME DESC LIMIT 20")
kanta_events_for_subject_1 <- bq_project_query(project_id, sql)

We can see that this table does not currently have information about the OMOP_CONCEPT_ID:

bq_table_download(kanta_events_for_subject_1)
#> # A tibble: 20 × 19
#>    FINNGENID  EVENT_AGE APPROX_EVENT_DATETIME TEST_NAME  TEST_ID OMOP_CONCEPT_ID
#>    <chr>          <dbl> <dttm>                <chr>      <chr>             <int>
#>  1 FG00000001      71.7 2022-12-07 12:07:00   pt-gfre    656400               -1
#>  2 FG00000001      71.6 2022-11-04 16:18:00   p-crp      1605013         3020460
#>  3 FG00000001      71.6 2022-10-17 12:00:11   b-trom     3002791         3007461
#>  4 FG00000001      71.6 2022-10-15 07:01:00   e-mchc     1557            3003338
#>  5 FG00000001      71.5 2022-09-22 11:53:00   e-mchc     1557            3003338
#>  6 FG00000001      71.5 2022-09-12 12:03:00   s-ca-ion   9010            3016431
#>  7 FG00000001      71.5 2022-09-07 06:00:00   b-trom     2791            3007461
#>  8 FG00000001      71.4 2022-08-31 17:39:14   u-eryt     11344           3002582
#>  9 FG00000001      71.4 2022-08-26 08:57:00   e-mchc     1557            3003338
#> 10 FG00000001      71.4 2022-08-22 08:32:00   b-eryt     1101341         3026361
#> 11 FG00000001      71.4 2022-08-18 07:00:00   b-trom     2791            3007461
#> 12 FG00000001      71.4 2022-08-04 07:26:00   b-hba1c    6128            3004410
#> 13 FG00000001      71.4 2022-08-03 08:52:00   p-krea     2142                 -1
#> 14 FG00000001      71.3 2022-07-18 07:07:00   -h-ind     20536                -1
#> 15 FG00000001      71.3 2022-07-18 04:45:00   b-trom     2791            3007461
#> 16 FG00000001      71.3 2022-06-20 08:40:00   b-eryt     1341            3026361
#> 17 FG00000001      71.2 2022-06-16 10:23:00   p-k        1999            3023103
#> 18 FG00000001      71.2 2022-06-15 09:02:00   b-trom     2791            3007461
#> 19 FG00000001      71.2 2022-06-02 07:31:00   p-afos     4587            3035995
#> 20 FG00000001      71.1 2022-05-09 00:22:00   pt-gluk-r1 6001483              -1
#> # ℹ 13 more variables: MEASUREMENT_VALUE <dbl>, MEASUREMENT_UNIT <chr>,
#> #   MEASUREMENT_VALUE_HARMONIZED <dbl>, MEASUREMENT_UNIT_HARMONIZED <chr>,
#> #   TEST_OUTCOME <chr>, MEASUREMENT_STATUS <chr>,
#> #   REFERENCE_RANGE_LOW_VALUE <dbl>, REFERENCE_RANGE_HIGH_VALUE <dbl>,
#> #   CODING_SYSTEM_OID <chr>, TEST_ID_SOURCE <chr>, TEST_NAME_SOURCE <chr>,
#> #   MEASUREMENT_VALUE_SOURCE <dbl>, MEASUREMENT_UNIT_SOURCE <chr>
(same table in html format for exporation)

Adding info to OMOP_CONCEPT_ID codes

Function fg_bq_append_concept_info_data adds a new column with information about the OMOP_CONCEPT_ID name.

event_for_subject_1_with_translations <- fg_bq_append_concept_info_data(
  project_id, kanta_events_for_subject_1, omop_schema)
bq_table_download(event_for_subject_1_with_translations) |>  
  dplyr::select(FINNGENID, EVENT_AGE, APPROX_EVENT_DATETIME, TEST_NAME,  TEST_ID, OMOP_CONCEPT_ID, omop_concept_name) 
#> # A tibble: 20 × 7
#>    FINNGENID  EVENT_AGE APPROX_EVENT_DATETIME TEST_NAME  TEST_ID OMOP_CONCEPT_ID
#>    <chr>          <dbl> <dttm>                <chr>      <chr>             <int>
#>  1 FG00000001      71.7 2022-12-07 12:07:00   pt-gfre    656400               -1
#>  2 FG00000001      71.6 2022-11-04 16:18:00   p-crp      1605013         3020460
#>  3 FG00000001      71.6 2022-10-17 12:00:11   b-trom     3002791         3007461
#>  4 FG00000001      71.6 2022-10-15 07:01:00   e-mchc     1557            3003338
#>  5 FG00000001      71.5 2022-09-22 11:53:00   e-mchc     1557            3003338
#>  6 FG00000001      71.5 2022-09-12 12:03:00   s-ca-ion   9010            3016431
#>  7 FG00000001      71.5 2022-09-07 06:00:00   b-trom     2791            3007461
#>  8 FG00000001      71.4 2022-08-31 17:39:14   u-eryt     11344           3002582
#>  9 FG00000001      71.4 2022-08-26 08:57:00   e-mchc     1557            3003338
#> 10 FG00000001      71.4 2022-08-22 08:32:00   b-eryt     1101341         3026361
#> 11 FG00000001      71.4 2022-08-18 07:00:00   b-trom     2791            3007461
#> 12 FG00000001      71.4 2022-08-04 07:26:00   b-hba1c    6128            3004410
#> 13 FG00000001      71.4 2022-08-03 08:52:00   p-krea     2142                 -1
#> 14 FG00000001      71.3 2022-07-18 07:07:00   -h-ind     20536                -1
#> 15 FG00000001      71.3 2022-07-18 04:45:00   b-trom     2791            3007461
#> 16 FG00000001      71.3 2022-06-20 08:40:00   b-eryt     1341            3026361
#> 17 FG00000001      71.2 2022-06-16 10:23:00   p-k        1999            3023103
#> 18 FG00000001      71.2 2022-06-15 09:02:00   b-trom     2791            3007461
#> 19 FG00000001      71.2 2022-06-02 07:31:00   p-afos     4587            3035995
#> 20 FG00000001      71.1 2022-05-09 00:22:00   pt-gluk-r1 6001483              -1
#> # ℹ 1 more variable: omop_concept_name <chr>
(same table in html format for exporation)