Commit

Tags
No Tags

Commit MetaInfo

Revision c1a8b701554dd8371c2d9274a2ea4d79ae3bf908 (tree)
Time 2025-01-01 02:33:41
Author Lorenzo Isella <lorenzo.isella@gmai...>
Committer Lorenzo Isella

Log Message

I added a script showcasing how to work on a CSV file with both arrow and duckplyr.

Change Summary

Diff

diff -r 646bddfa8bc1 -r c1a8b701554d R-codes/duckplyr_test.R
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/R-codes/duckplyr_test.R Tue Dec 31 18:33:41 2024 +0100
@@ -0,0 +1,40 @@
+rm(list=ls())
+library(tidyverse)
+library(duckplyr)
+library(tictoc)
+library(arrow)
+
+
+# Increased the size of data
+## dd <- tibble(x=1:100000000, y=rep(LETTERS[1:20], 5000000))
+
+
+## write_csv(dd, "test.csv")
+
+
+df <- duck_csv("test.csv")
+
+system.time({
+df_stat <- df |>
+ summarise(total=sum(x), .by = y) |>
+ collect() |>
+ as_tibble()
+
+})
+
+
+df2 <- open_dataset("test.csv",
+ format = "csv",
+ skip_rows = 0)
+
+system.time({
+ df_stat2 <- df2 |>
+ group_by(y) |>
+ summarise(total=sum(x)) |>
+ ungroup() |>
+ collect()
+
+})
+
+
+print("So far so good")
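
The script above times the two backends but never checks that they return the same totals. A minimal sketch of such a check for the arrow path, run on a much smaller file, might look like the following. The file name `test_small.csv`, the reduced row count, and the variable names are assumptions made for illustration and are not part of the commit. The duckplyr path is left out because its CSV reader has changed names between releases.

```r
# Small-scale sketch; file name and sizes are hypothetical, not from the commit.
library(dplyr)
library(readr)
library(arrow)

# 100,000 rows instead of the 100,000,000 in the commented-out generator.
dd <- tibble(x = 1:100000, y = rep(LETTERS[1:20], 5000))

csv_path <- file.path(tempdir(), "test_small.csv")
write_csv(dd, csv_path)

# Reference result computed in memory with dplyr.
res_mem <- dd |>
  summarise(total = sum(x), .by = y) |>
  arrange(y)

# Same aggregation expressed against an arrow dataset;
# the query is evaluated when collect() is called.
res_arrow <- open_dataset(csv_path, format = "csv") |>
  group_by(y) |>
  summarise(total = sum(x)) |>
  collect() |>
  arrange(y)

# Compare as doubles so integer-width differences between backends do not matter.
agree <- isTRUE(all.equal(as.numeric(res_arrow$total),
                          as.numeric(res_mem$total)))
print(agree)
```

Sorting both results by `y` before comparing matters because neither backend guarantees the order of groups in its output.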