@elliVM elliVM commented Dec 18, 2025

Description

The logtime function throws an exception when the SQL functions it uses fail, which forces the SQL to run in throw none mode.

This change makes the function that extracts the logtime epoch from the path string return epoch 0 instead of throwing an exception when the SQL functions it uses fail.

  • Extracts the logtime function into its own object for easier unit testing
  • Wraps the expression in SQL COALESCE to return 0 when the result is NULL
  • Adds an IFNULL check around SUBSTRING, with 1970010100 as the fallback value

@elliVM elliVM self-assigned this Dec 18, 2025
@elliVM elliVM linked an issue Dec 18, 2025 that may be closed by this pull request
@elliVM elliVM requested a review from Tiihott December 18, 2025 14:01
Field<Long> asField() {
final String unixTimestamp = "UNIX_TIMESTAMP(STR_TO_DATE("
+ "IFNULL(SUBSTRING(REGEXP_SUBSTR({0},'^\\\\d{4}\\\\/\\\\d{2}-\\\\d{2}\\\\/[\\\\w\\\\.-]+\\\\/([^\\\\p{Z}\\\\p{C}]+?)\\\\/([^\\\\p{Z}\\\\p{C}]+)(-@)?(\\\\d+|)-(\\\\d{4}\\\\d{2}\\\\d{2}\\\\d{2})'), -10, 10), '1970010100'), '%Y%m%d%H'))";
return DSL.field("COALESCE(" + unixTimestamp + ", 0)", Long.class, pathField);

Using 1970 as a magic value is not really an option. A better way would be to return SQL NULL when extraction is not possible.

Using

SELECT REGEXP_REPLACE('2010/01-08/sc-99-99-14-40/f17_v2/f17_v2.logGLOB-2010011601.log.gz', '^\\d{4}\\/\\d{2}-\\d{2}\\/[\\w\\.-]+\\/([^\\p{Z}\\p{C}]+?)\\/([^\\p{Z}\\p{C}]+)(-@)?(\\d+|)-(\\d{4}\\d{2}\\d{2}\\d{2}).*', '\\5');

for example, would give better results. We should rather check that the extraction succeeded and that the extracted value parses to a date, and only then convert it to a Unix timestamp.

These checks can be done in the style of if(value regex 'valid value', unixtime(value), null).
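
To make the intended semantics concrete, here is a minimal plain-Java sketch of the same check-then-convert flow: match the path, validate the extracted value as a date, and return null in every other case. It only illustrates the behaviour and is not the SQL for this PR. The class name LogtimeSketch and the method epochOrNull are hypothetical, and UTC is an assumption (UNIX_TIMESTAMP in the database uses the session time zone).

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of the PR.
final class LogtimeSketch {

    // Same pattern as in the diff, written as a Java regex; group 5 is yyyyMMddHH.
    private static final Pattern PATH = Pattern.compile(
            "^\\d{4}/\\d{2}-\\d{2}/[\\w.-]+/([^\\p{Z}\\p{C}]+?)/([^\\p{Z}\\p{C}]+)(-@)?(\\d+|)-(\\d{4}\\d{2}\\d{2}\\d{2})");

    // Returns epoch seconds, or null when the path does not match
    // or the extracted digits are not a valid date and hour.
    static Long epochOrNull(String path) {
        if (path == null) {
            return null;
        }
        Matcher m = PATH.matcher(path);
        if (!m.find()) {
            return null; // extraction failed: no magic fallback value
        }
        String stamp = m.group(5);
        try {
            LocalDateTime time = LocalDateTime.of(
                    Integer.parseInt(stamp.substring(0, 4)),
                    Integer.parseInt(stamp.substring(4, 6)),
                    Integer.parseInt(stamp.substring(6, 8)),
                    Integer.parseInt(stamp.substring(8, 10)),
                    0);
            // Assumption: UTC, whereas the database would use its session time zone.
            return time.toEpochSecond(ZoneOffset.UTC);
        } catch (DateTimeException e) {
            return null; // digits extracted, but they do not form a valid date
        }
    }

    public static void main(String[] args) {
        System.out.println(epochOrNull(
                "2010/01-08/sc-99-99-14-40/f17_v2/f17_v2.logGLOB-2010011601.log.gz")); // 1263603600
        System.out.println(epochOrNull("not/a/log/path")); // null
    }
}
```

Returning null rather than 0 lets the caller tell "no logtime could be extracted" apart from a real timestamp at the epoch, which is the point of avoiding the 1970 fallback.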


Development

Successfully merging this pull request may close these issues.

avoid throwing sql exceptions in logtimeFunction
